I am looking into integrating Elasticsearch and Hadoop using es-hadoop. I would like to know whether it supports the Avro format when moving data between Hadoop and Elasticsearch, in either direction.
If yes, are there any plugins or special configuration needed to support it?
If it is not currently supported, are there any plans to make it available?
Yes. REGISTER piggybank.jar, load your Avro data with AvroStorage, and then store it in your ES cluster with EsStorage.
For example:
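-- piggybank provides the AvroStorage loader used below
REGISTER piggybank.jar;
-- Load the Avro records, taking the schema from the given .avsc file
records = LOAD '/input/data' USING org.apache.pig.piggybank.storage.avro.AvroStorage('no_schema_check',
'schema_file', 'examples/schema/test.avsc');
-- Write the records to the existing 'library/book' index (auto-creation is disabled)
STORE records INTO 'library/book' USING org.elasticsearch.hadoop.pig.EsStorage('es.http.timeout=5m', 'es.index.auto.create=false');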
Cheers,
Vijay
Thanks for the example @vijaym123. es-hadoop supports any data format
supported/available in Hadoop. As in Vijay's example, simply use the
appropriate Storage, InputFormat/OutputFormat, or Loader for your format
and you're set.
This is consistent across all the libraries supported by the connector:
Map/Reduce, Hive, Pig, Spark, Storm, etc.