I am reading data from Elasticsearch through an external table in Hive. I want to load data from ES into HDFS continuously instead of running an external query against ES, and then fetch the data into Hive from HDFS. Ultimately, I want to fetch data from Hive into Tableau.
Please suggest a way to store data from ES in HDFS.
@rrai I'm afraid that ES-Hadoop does not yet support continuously streaming data from Elasticsearch. Once all data is consumed from Elasticsearch, the job will conclude. You could instead try a regularly scheduled export process that only targets the most recently changed data. You can do this by specifying the ES query your job runs with in the es.query setting: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html#_querying
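As a rough illustration of the scheduled-export idea, here is a minimal Python sketch that builds a query DSL string covering only the documents changed since the previous run. It assumes your documents carry a last-modified timestamp; the field name `updated_at`, the helper name `build_incremental_query`, and the one-hour window are all hypothetical and would need to match your own index and schedule.

```python
import json
from datetime import datetime, timedelta, timezone


def build_incremental_query(last_run, now, field="updated_at"):
    """Return a query DSL string selecting documents changed in [last_run, now).

    `field` is a hypothetical last-modified timestamp field; replace it with
    whatever your index actually stores.
    """
    query = {
        "query": {
            "range": {
                field: {
                    "gte": last_run.isoformat(),  # inclusive lower bound
                    "lt": now.isoformat(),        # exclusive upper bound
                }
            }
        }
    }
    return json.dumps(query)


# Example: an hourly export window ending at a fixed point in time.
now = datetime(2024, 1, 2, tzinfo=timezone.utc)
last_run = now - timedelta(hours=1)
es_query = build_incremental_query(last_run, now)
print(es_query)
```

The resulting string is the kind of value you could supply as es.query for each scheduled run. Using a half-open window (inclusive start, exclusive end) is one way to keep consecutive runs from exporting the same document twice.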