Write to HDFS from Elasticsearch


#1

Is it possible to write from Elasticsearch to HDFS?

The current pipeline is:
fluentd -> elasticsearch
fluentd -> flume -> hdfs

Elasticsearch is used for realtime analysis, and Hive (on top of HDFS) is used for longer-term analysis.

I would like to replace this setup with:
fluentd -> elasticsearch -> hdfs

but I have not been able to confirm whether this is possible or feasible.

Does anyone have experience with this, or can anyone point me to the docs?

Thanks.


(James Baiera) #2

If you want to export data from Elasticsearch into files on HDFS, you can use any of the integrations provided in the ES-Hadoop package: point the job at the Elasticsearch indices you want to export as the source, and at HDFS as the output location.

There are several ways to do this. One is to create an external Hive table (backed by ES) and a new Hive table (backed by HDFS), then run an INSERT ... SELECT statement to transfer the data.
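
As a rough illustration of the Hive route, here is a minimal Python sketch that only builds the three HiveQL statements involved; it does not connect to Hive or Elasticsearch. The table names, column list, index name and host are hypothetical placeholders, and the choice of ORC as the HDFS storage format is an assumption. The storage handler class and the `es.resource` / `es.nodes` properties are the ones used by the ES-Hadoop Hive integration, but check them against the version you run.

```python
def build_export_statements(es_table, hdfs_table, columns, es_resource, es_nodes):
    """Return HiveQL statements that copy an Elasticsearch index into an
    HDFS-backed Hive table. `columns` is a list of (name, hive_type) pairs.

    All names passed in are illustrative; nothing is executed here.
    """
    col_defs = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    col_names = ", ".join(f"`{name}`" for name, _ in columns)

    # External table whose rows are read from Elasticsearch through ES-Hadoop.
    create_es = (
        f"CREATE EXTERNAL TABLE {es_table} (\n  {col_defs}\n)\n"
        "STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'\n"
        f"TBLPROPERTIES ('es.resource' = '{es_resource}', 'es.nodes' = '{es_nodes}')"
    )

    # Ordinary Hive table; its data files live on HDFS (ORC is an assumption).
    create_hdfs = (
        f"CREATE TABLE {hdfs_table} (\n  {col_defs}\n)\n"
        "STORED AS ORC"
    )

    # The INSERT ... SELECT that actually moves the data.
    copy = f"INSERT OVERWRITE TABLE {hdfs_table} SELECT {col_names} FROM {es_table}"

    return [create_es, create_hdfs, copy]


if __name__ == "__main__":
    statements = build_export_statements(
        es_table="logs_es",            # hypothetical table name
        hdfs_table="logs_hdfs",        # hypothetical table name
        columns=[("ts", "TIMESTAMP"), ("message", "STRING")],
        es_resource="logs",            # hypothetical index name
        es_nodes="es-host:9200",       # hypothetical Elasticsearch address
    )
    for stmt in statements:
        print(stmt + ";\n")
```

The printed statements could then be run from the Hive CLI or Beeline with the ES-Hadoop jar on the classpath. `INSERT OVERWRITE` replaces the target table's contents on every run; for a recurring export you would probably want a filtered `SELECT` or a partitioned target table instead.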

For more information on ES-Hadoop, take a look at the ES-Hadoop documentation pages.

