Hadoop-Elasticsearch - Avro Support


(Ajaybhatnagar) #1

Hi,

I am looking into integrating Elasticsearch and Hadoop using es-hadoop. I would like to know whether it supports the Avro format when moving data between Hadoop and Elasticsearch, in either direction.

If yes, are there any requirements, plugins, or special configuration needed to support it?
If it is not currently supported, are there plans to make it available?

Thanks
Ajay


(Vijaym123) #2

Register piggybank.jar, read your Avro data with AvroStorage, and then store it on your ES cluster with EsStorage!

For example:

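-- piggybank.jar provides AvroStorage; the es-hadoop jar (which provides EsStorage) must also be available to Pig
REGISTER piggybank.jar;
-- Load the Avro records, using the schema from the .avsc file
records = LOAD '/input/data' USING org.apache.pig.piggybank.storage.avro.AvroStorage('no_schema_check', 'schema_file', 'examples/schema/test.avsc');
-- Write the records to library/book; with es.index.auto.create=false the index must already exist
STORE records INTO 'library/book' USING org.elasticsearch.hadoop.pig.EsStorage('es.http.timeout=5m', 'es.index.auto.create=false');

Cheers,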
Vijay


(Ajaybhatnagar) #3

Thanks, Vijay.
We will give it a try.
Ajay


(Costin Leau) #4

Thanks for the example @vijaym123. es-hadoop supports any data format
supported/available in Hadoop. As in the example above, simply use the
appropriate Storage/InputFormat/OutputFormat/Loader and you're set.
This is consistent across all the libraries supported by the connector:
Map/Reduce, Hive, Pig, Spark, Storm, etc.
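
Since the original question asked about both directions, the reverse path (Elasticsearch to Avro) follows the same pattern: load with EsStorage and store with AvroStorage. The following is only a minimal sketch, not a tested script. The jar file name, the output path, and the field names in the AS clause are assumptions made for illustration; adjust them to your own setup and mapping.

```pig
-- Jar names/paths are assumptions; point these at your actual jars
REGISTER piggybank.jar;
REGISTER elasticsearch-hadoop.jar;

-- Read documents from the library/book index/type.
-- The schema in the AS clause is hypothetical; declare the fields your documents actually have.
books = LOAD 'library/book' USING org.elasticsearch.hadoop.pig.EsStorage()
        AS (title:chararray, author:chararray);

-- Write the records as Avro; here AvroStorage derives the Avro schema from the Pig schema.
-- The output path is an assumption.
STORE books INTO '/output/books' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
```

Depending on your Pig and piggybank versions, AvroStorage may need additional jars (for example the Avro library itself) registered as well.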

