My situation: I have two logstash-forwarders sending network device logs to an ELK server. I used to create time-based indices (YYYY+MM+DD), but I noticed that server performance degraded with that configuration, so I switched to static indices, which resolved the performance problem.
But now I have large indices on the ELK server, and I need to retrieve data from them in order to migrate or delete it.
At the moment I delete the indices and start them again from scratch, but I want to be able to select specific data from an index and perform different actions on it.
Reindex has a "Reindex from Remote" feature that basically streams the data from one index to another on a different cluster. That gives you the flexibility to select only some of the data: subsets, certain types, etc.
If you just want everything, Snapshot and Restore would be faster, but it's less flexible in what it can capture (e.g. it'll grab entire indices).
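As a rough illustration of the Reindex from Remote option, here is a minimal Python sketch that only builds the JSON body you would POST to `_reindex` on the destination cluster. The host `http://old-elk:9200`, the index names, and the `@timestamp` field are assumptions made up for this example, and the helper function is hypothetical. The remote host also has to be listed under `reindex.remote.whitelist` in the destination cluster's `elasticsearch.yml`.

```python
import json


def build_remote_reindex_body(remote_host, source_index, dest_index, query=None):
    """Build the body for POST _reindex (hypothetical helper, no network I/O)."""
    source = {"remote": {"host": remote_host}, "index": source_index}
    if query is not None:
        # Optional query: only matching documents are copied.
        source["query"] = query
    return {"source": source, "dest": {"index": dest_index}}


# Assumed names: adjust host, indices, and the time field to your setup.
body = build_remote_reindex_body(
    "http://old-elk:9200",
    "network-logs",
    "network-logs-subset",
    query={"range": {"@timestamp": {"gte": "now-3M"}}},
)
print(json.dumps(body, indent=2))
```

Leaving out `query` copies the whole index; adding it is what gives you the "only some of the data" flexibility.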
Thanks for your answer, and sorry for taking so long to come back to this topic.
Is it possible to perform the same kind of operation as "Reindex from Remote", but delete the data directly instead of migrating it to another cluster?
I don't intend to migrate the data, just delete it while keeping the last 3 months.
You can reindex into the same cluster, which allows you to move data from one index to another. You can then delete the old index if you want, or run a DeleteByQuery, etc.
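For the "keep the last 3 months" case, both approaches could look roughly like the sketch below. It only builds the request bodies and assumes the documents carry a `@timestamp` date field (the usual Logstash default); the index names and helper functions are hypothetical. The first body would go to `POST _reindex`, the second to `POST <index>/_delete_by_query`.

```python
import json


def build_local_reindex_body(source_index, dest_index, keep_since="now-3M",
                             time_field="@timestamp"):
    """Body for POST _reindex: copy only documents newer than keep_since."""
    return {
        "source": {
            "index": source_index,
            "query": {"range": {time_field: {"gte": keep_since}}},
        },
        "dest": {"index": dest_index},
    }


def build_delete_older_than_body(older_than="now-3M", time_field="@timestamp"):
    """Body for POST <index>/_delete_by_query: delete documents older than older_than."""
    return {"query": {"range": {time_field: {"lt": older_than}}}}


# Assumed index names, for illustration only.
reindex_body = build_local_reindex_body("network-logs", "network-logs-recent")
delete_body = build_delete_older_than_body()

print(json.dumps(reindex_body, indent=2))
print(json.dumps(delete_body, indent=2))
```

Delete By Query only marks documents as deleted, and the disk space is reclaimed later when segments merge. If most of the index is old data, reindexing the recent part into a new index and then deleting the old index is likely the cheaper route.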