How do we store Elasticsearch data in a Hadoop cluster (Ambari)?

Hello all,

I am running a 3-node Elasticsearch (7.4.2) cluster that generates approximately 100 GB of data daily. I would like to keep only 30 days of data in Elasticsearch and store the older data in Hadoop (Ambari 2.7.0, HDFS 3.0.0). I found the 'Hadoop HDFS Repository' plugin for this job but was not able to get it working. Can someone please share the complete steps for meeting this requirement?

Note: I have installed the plugin on all three Elasticsearch nodes:
sudo bin/elasticsearch-plugin install repository-hdfs
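
One step that is easy to miss: a newly installed plugin is only picked up after each node is restarted. Assuming a package-based install managed by systemd, that would be something like:

# restart each node in turn so the plugin loads (unit name assumes a package install)
sudo systemctl restart elasticsearch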

I am unable to understand how to use the request below:
PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020/",
    "path": "elasticsearch/repositories/my_hdfs_repository",
    "conf.dfs.client.read.shortcircuit": "true"
  }
}
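
For context on those settings: "uri" points at the HDFS NameNode, "path" is the directory inside HDFS where snapshot files are written, and keys prefixed with "conf." are passed through to the Hadoop client configuration. Once the repository is registered, it can be sanity-checked with the repository verify API:

POST _snapshot/my_hdfs_repository/_verify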

How do I configure the path and the other values? Please help.

Hi @kishore419.

Do you want to store Elasticsearch index data into HDFS, or HDFS data into Elasticsearch?

If you want to store data into HDFS, create a Hive external table pointing at your Elasticsearch index data, then use CTAS (CREATE TABLE AS SELECT) to copy the external table into another Hive table; a sketch of this follows below.
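
A minimal sketch of that approach, assuming the ES-Hadoop connector jar is available to Hive; the jar path, index name, fields, and host below are placeholders, not taken from this thread:

ADD JAR /path/to/elasticsearch-hadoop.jar;

-- external table backed by the Elasticsearch index
CREATE EXTERNAL TABLE es_logs (ts TIMESTAMP, message STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ('es.resource' = 'logs-2019.12', 'es.nodes' = 'es-node1:9200');

-- CTAS copies the rows into an ordinary HDFS-backed Hive table
CREATE TABLE hdfs_logs STORED AS ORC AS SELECT * FROM es_logs;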

What do you mean by maintaining only 30 days of data in Elasticsearch?

Please share more details.

Thanks
HadoopHelp

Hi Ramesh,

I want to store Elasticsearch data in Hadoop. I found the 'Hadoop HDFS Repository' plugin for this job, but I got stuck while executing the following request from Kibana (Dev Tools):

PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020/",
    "path": "elasticsearch/repositories/my_hdfs_repository",
    "conf.dfs.client.read.shortcircuit": "true"
  }
}
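
Assuming the repository registers without errors, the next step would be taking a snapshot of the indices to be moved; the snapshot name and index pattern here are invented for illustration:

PUT _snapshot/my_hdfs_repository/snapshot_1?wait_for_completion=true
{
  "indices": "logs-*"
}

GET _snapshot/my_hdfs_repository/snapshot_1

Once a snapshot completes successfully, the snapshotted indices can be deleted from Elasticsearch (e.g. DELETE /logs-2019.11.01) to enforce the 30-day retention.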

I am new to Hadoop and am running a 3-node Hadoop cluster managed by Ambari. Do we need to create any path in the Hadoop cluster, or make any Hadoop configuration changes?
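
One Hadoop-side step that is often required (an assumption here, since it depends on how HDFS permissions are set up): pre-create the repository path and make it writable by the OS user the Elasticsearch process runs as, commonly 'elasticsearch':

# create the snapshot directory in HDFS and hand it to the Elasticsearch user
sudo -u hdfs hdfs dfs -mkdir -p /elasticsearch/repositories/my_hdfs_repository
sudo -u hdfs hdfs dfs -chown -R elasticsearch /elasticsearch/repositories/my_hdfs_repository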

Let me know if you have any questions.
