Hello all,
I am running a 3-node Elasticsearch (7.4.2) cluster and generating roughly 100 GB of data daily. I would like to keep only the last 30 days of data in Elasticsearch; the remaining data should be stored in Hadoop (Ambari 2.7.0, HDFS 3.0.0).
I found the 'Hadoop HDFS Repository' plugin for doing this job,
but I was not able to use it.
Can someone please share the complete steps for the above requirement?
Note: I have installed the plugin on all three Elasticsearch nodes:
sudo bin/elasticsearch-plugin install repository-hdfs
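In case it helps anyone checking their own setup: each node needs a restart after installing the plugin, and you can confirm it actually landed on every node with the plugin CLI (run from the Elasticsearch home directory, same as the install command above):

```shell
# Lists installed plugins; "repository-hdfs" should appear on every node
sudo bin/elasticsearch-plugin list
```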
I am unable to understand how to use the query below:
PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020/",
    "path": "elasticsearch/repositories/my_hdfs_repository",
    "conf.dfs.client.read.shortcircuit": "true"
  }
}
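For what it's worth, the PUT above only registers the HDFS repository; it does not move any data by itself. Once it succeeds, you snapshot indices into the repository and then delete them from the cluster to enforce retention. A rough sketch in the same Kibana Dev Tools syntax (the repository name matches the request above; the index and snapshot names are just examples):

```
# Check the repository is readable/writable from every node
POST _snapshot/my_hdfs_repository/_verify

# Snapshot one daily index into HDFS
PUT _snapshot/my_hdfs_repository/snapshot-logs-2019.11.01?wait_for_completion=true
{
  "indices": "logs-2019.11.01",
  "include_global_state": false
}

# After the snapshot reports state SUCCESS, the index can be
# deleted from Elasticsearch to keep only 30 days in the cluster
DELETE logs-2019.11.01
```

Automating this daily (e.g. with Curator or a cron script) is what gives you the 30-day window.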
Do you want to store Elasticsearch index data in HDFS, or HDFS data in Elasticsearch?
If you want to store data in HDFS, then create a Hive external table pointing at your Elasticsearch index data, and use CTAS to copy the external Hive table into another Hive table.
What do you mean by maintaining only 30 days of data in Elasticsearch?
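(For readers comparing the two approaches: the Hive route described above uses the ES-Hadoop connector rather than the repository-hdfs plugin. A hedged sketch, assuming the elasticsearch-hadoop jar is on the Hive classpath and the index and host names are placeholders:)

```sql
-- External table backed by an Elasticsearch index (names are examples)
CREATE EXTERNAL TABLE es_logs (message STRING, ts TIMESTAMP)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource' = 'logs-2019.11.01',
  'es.nodes'    = 'es-node1:9200'
);

-- CTAS: materialize the data into a plain Hive table stored on HDFS
CREATE TABLE logs_archive AS SELECT * FROM es_logs;
```

Note the result is queryable Hive data, whereas the repository plugin stores opaque snapshot files that can only be restored back into Elasticsearch.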
Hi Ramesh,
I want to store Elasticsearch data in Hadoop.
I found the 'Hadoop HDFS Repository' plugin for doing this job,
but I got stuck while executing the command above from Kibana (Dev Tools).
I am new to Hadoop and am using a 3-node Hadoop cluster managed by Ambari.
Do we need to create any path on the Hadoop cluster,
or are any Hadoop configuration changes needed?
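(One likely answer to the path question: yes, the HDFS path named in the repository settings has to exist and be writable by the user the Elasticsearch JVM runs as. A sketch with the Hadoop CLI; the path matches the settings earlier in the thread, and the 'elasticsearch' user name is an assumption to adjust for your install:)

```shell
# Create the repository path on HDFS (run as the hdfs superuser)
sudo -u hdfs hdfs dfs -mkdir -p /elasticsearch/repositories/my_hdfs_repository

# Make it writable by the user Elasticsearch runs as
# (often 'elasticsearch'; adjust to your setup)
sudo -u hdfs hdfs dfs -chown -R elasticsearch /elasticsearch/repositories/my_hdfs_repository
```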