I am using ELK, and some of my indexes are getting large. I would like to delete documents that fall within a provided timeframe. For example, delete all documents in a certain time range.
I am using Elasticsearch 5.3.2.
If you are writing into a single index, and not using a time-based one, e.g. logstash-2017.07.06, you will need to use delete-by-query which Mark linked to. This is generally much less efficient than simply dropping whole indices if you are using indices per time period.
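For the single-index case, a delete-by-query request with a `range` filter does what was described above. This is a sketch, not something from the thread: the index name `my-big-index` and the field name `@timestamp` are assumptions, so substitute your own index and timestamp field.

```
POST /my-big-index/_delete_by_query
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2017-01-01T00:00:00Z",
        "lt":  "2017-02-01T00:00:00Z"
      }
    }
  }
}
```

Note that this still has to find and mark every matching document, which is why it is much more expensive than dropping a whole index.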
You suggest that I create an index per day's worth of logs. For example, my application generates log files with a datestamp in the name, something like mylog.log.2017-06-06. I should then create indices following the same naming scheme, so that I can simply delete the indices I no longer need.
Did I understand it correctly?
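Yes, with daily indices the cleanup is a single request per day. As a sketch (the index name `mylog-2017.06.06` is just an assumed example matching the naming scheme above):

```
DELETE /mylog-2017.06.06
```

Dropping an index removes all of its documents at once, without the per-document work that delete-by-query has to do.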
I see your point. But if I wanted to search across some of the indexes, going deeper into the past (several months), would that not complicate my queries?
If I have indexes based on days, then I would need to take all of them into account. Even if I break them up by months, an annual search would still need to group them somehow, would it not?
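It does not complicate the queries in practice, because Elasticsearch accepts wildcard index patterns, so an annual search can target `mylog-2017.*` instead of enumerating every daily index. A minimal sketch of the mapping between a date range and daily index names (the `mylog-` prefix and the `YYYY.MM.DD` format are assumptions based on the naming scheme discussed above):

```python
from datetime import date, timedelta

def daily_indices(prefix, start, end):
    """Return one index name per day in [start, end], e.g. mylog-2017.06.06."""
    days = (end - start).days + 1
    return [f"{prefix}-{(start + timedelta(d)).strftime('%Y.%m.%d')}"
            for d in range(days)]

# A month's worth of daily indices...
june = daily_indices("mylog", date(2017, 6, 1), date(2017, 6, 30))
print(len(june))    # 30 daily indices
print(june[0])      # mylog-2017.06.01

# ...but a query never needs this list: a wildcard pattern such as
# "mylog-2017.06.*" (or "mylog-2017.*" for the year) covers them all.
```

The wildcard pattern is what Kibana index patterns use under the hood, which is why daily indices do not change how you write searches.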
Currently, I am using Kibana to show me some statistics for one year or so.
Wow, I did not know that. So I might have 365 indexes, a year's worth... and Kibana will simply handle it? I mean, figure it out and do whatever 'joins' are needed internally?
If your data volumes allow for it, I would probably recommend switching to monthly rather than daily indices. As data is allocated to indices based on timestamp, Kibana can limit a query to only the indices that hold data for the selected time period, which can reduce the amount of data queried.