Blog for the Open Source JavaHotel project

Thursday, 29 December 2016

IBM InfoSphere Streams, BigSql and HBase

Introduction
Some time ago I created Java methods to load data into HBase in a format that BigSql understands. Now it is high time to move ahead and create an IBM InfoSphere Streams operator that makes use of this solution.
The Streams solution is available here. A short description is available here.
JConvert operator
The JConvert operator does not load data into HBase itself; it should precede the HBasePut operator.
JConvert accepts one or more input streams, and every input stream should have a corresponding output stream. It simply encodes every attribute of the input stream into a blob (binary) attribute of the output stream. The binary value is later loaded into the HBase table by the HBasePut operator.
It is very important to keep the JConvert input stream consistent with the target HBase/BigSql table. Neither JConvert nor HBasePut can verify this: if the attribute types and the column types do not match, BigSql will not read the table properly. The conversion rules are explained here.
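The attribute-to-blob step can be pictured with a minimal Java sketch. This is not the JConvert source and it does not reproduce the byte layout BigSql expects; it uses plain big-endian encoding purely as an illustration. The class name `AttributeEncoder` and its methods are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: plain big-endian encoding, NOT the real BigSql/HBase byte layout.
final class AttributeEncoder {

    private AttributeEncoder() {
    }

    // Encode a single attribute value into a byte array (the "blob").
    static byte[] encode(Object value) {
        if (value instanceof Integer) {
            return ByteBuffer.allocate(Integer.BYTES).putInt((Integer) value).array();
        }
        if (value instanceof Long) {
            return ByteBuffer.allocate(Long.BYTES).putLong((Long) value).array();
        }
        if (value instanceof Double) {
            return ByteBuffer.allocate(Double.BYTES).putDouble((Double) value).array();
        }
        if (value instanceof String) {
            return ((String) value).getBytes(StandardCharsets.UTF_8);
        }
        // Covers null as well as any type this sketch does not handle.
        throw new IllegalArgumentException("Unsupported attribute type: " + value);
    }

    // Encode every attribute of an input tuple, keeping attribute names and order.
    static Map<String, byte[]> encodeTuple(Map<String, Object> tuple) {
        Map<String, byte[]> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : tuple.entrySet()) {
            out.put(e.getKey(), encode(e.getValue()));
        }
        return out;
    }
}
```

Calling `AttributeEncoder.encodeTuple(tuple)` on a map of attribute names to values returns a map with the same keys and one byte array per attribute, which mirrors the one-blob-per-attribute shape of the JConvert output stream.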
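Because neither operator validates the schema, a pre-flight check outside Streams can catch mismatches early. The sketch below is a hypothetical helper, not part of the project: the class `SchemaCheck` is invented for illustration, and the pairing of SPL types to BigSql column types is an assumed, simplified one rather than the project's actual conversion rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: compares stream attribute types with table column types.
final class SchemaCheck {

    // ASSUMED pairing of SPL attribute types to BigSql column types (illustration only).
    private static final Map<String, String> EXPECTED = Map.of(
            "int32", "INT",
            "int64", "BIGINT",
            "float64", "DOUBLE",
            "rstring", "VARCHAR");

    private SchemaCheck() {
    }

    // Returns one message per attribute that has no matching column or a mismatched type.
    static List<String> mismatches(Map<String, String> streamAttrs,
                                   Map<String, String> tableCols) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> attr : streamAttrs.entrySet()) {
            String column = tableCols.get(attr.getKey());
            String expected = EXPECTED.get(attr.getValue());
            if (column == null) {
                problems.add(attr.getKey() + ": no such column in the table");
            } else if (expected == null) {
                problems.add(attr.getKey() + ": unsupported stream type " + attr.getValue());
            } else if (!expected.equalsIgnoreCase(column)) {
                problems.add(attr.getKey() + ": " + attr.getValue()
                        + " does not match column type " + column);
            }
        }
        return problems;
    }
}
```

An empty list from `SchemaCheck.mismatches(streamAttrs, tableCols)` means every stream attribute has a column of the assumed matching type; any other result should be resolved before data is loaded.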
TestHBaseN
This operator is used for testing; it also contains many usage examples.
A simple BigSql/HBase loading scenario:



On the left is the producer; JConvert then translates all input attributes into binary format, and the HBasePut operator loads the binaries into the HBase table.
More details about TestHBaseN.
