Delete documents by timestamp

I am using ELK, and some of my indexes are getting large. I would like to delete the documents that fall within a given timeframe, for example all documents in a certain time range.
I am using Elasticsearch 5.3.2.

I use Kibana to issue queries to ES.

Thank you.

You need to use https://www.elastic.co/guide/en/elasticsearch/reference/5.4/docs-delete-by-query.html unless you are using time-based indices.

Each of my documents has a @timestamp when it was indexed.

Are you using time-based indices?

I am sorry if I do not understand exactly, but I will try to answer.

Whenever a log is written as a document, a @timestamp is added with the time the log was created.

If you are writing into a single index, and not using a time-based one, e.g. logstash-2017.07.06, you will need to use delete-by-query which Mark linked to. This is generally much less efficient than simply dropping whole indices if you are using indices per time period.

Yes. I am writing into a single index.

This is what I have come up with:

POST myindex/_delete_by_query
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "09/02/2017",
        "lte": "11/02/2017",
        "format": "dd/MM/yyyy||yyyy"
      }
    }
  }
}

And it works.

Thank you for your suggestions.
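For anyone who wants to generate that request body rather than type it by hand, here is a minimal sketch in Python. The helper name `build_delete_by_range` is made up for illustration, and it assumes ISO dates with a half-open range (`gte`/`lt`), so the end boundary does not depend on how a date without a time part gets rounded. It only builds the JSON; it does not talk to the cluster.

```python
import json
from datetime import date


def build_delete_by_range(start: date, end: date, field: str = "@timestamp") -> dict:
    """Build a delete-by-query body matching start <= field < end.

    Hypothetical helper: the name and defaults are assumptions,
    not part of any Elasticsearch client library.
    """
    if end <= start:
        raise ValueError("end must be after start")
    return {
        "query": {
            "range": {
                field: {
                    "gte": start.isoformat(),   # inclusive lower bound
                    "lt": end.isoformat(),      # exclusive upper bound
                    "format": "yyyy-MM-dd",
                }
            }
        }
    }


# Same three days as the request above: 9, 10 and 11 February 2017.
body = build_delete_by_range(date(2017, 2, 9), date(2017, 2, 12))
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted under `POST myindex/_delete_by_query` in Kibana. Sending the same body to `GET myindex/_count` first should show how many documents would be affected before anything is deleted.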

You should really change to time-based indices.

Just to recap if I understand it correctly.

You suggest that I create one index per day's worth of logs. For example, my application writes its logs to files with a datestamp in the name, something like: mylog.log.2017-06-06.

I should then create an index with a matching name for each day. That way, I can simply delete the indexes I no longer want.
Did I understand it correctly?

Yep!
Daily, weekly or monthly is better than a single index.
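To make the retention point concrete, here is a rough sketch of picking out the daily indices that are past a retention window. It assumes index names of the form `logstash-YYYY.MM.DD` as in the example earlier in the thread; the function name `expired_indices` and the 30-day window are invented for illustration. It only prints the `DELETE` lines you could run in Kibana, nothing is sent to Elasticsearch.

```python
from datetime import date, datetime, timedelta


def expired_indices(names, today, keep_days, prefix="logstash-"):
    """Return daily indices (prefix + YYYY.MM.DD) older than keep_days.

    Hypothetical helper; the naming scheme is an assumption.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        if not name.startswith(prefix):
            continue  # not one of our time-based indices
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date; leave the index alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


names = ["logstash-2017.07.06", "logstash-2017.06.01", "logstash-2017.05.30", ".kibana"]
for name in expired_indices(names, today=date(2017, 7, 6), keep_days=30):
    print(f"DELETE {name}")
```

Dropping a whole index this way avoids touching individual documents, which is the efficiency difference mentioned above.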

I see your point. But if I want to search further back, say several months into the past, would that not complicate my queries?

If I have daily indexes, then I would have to include all of them in the search. And even if I split them by month, then for an annual search I would still need to group them somehow?

Currently, I am using Kibana to show me some statistics for one year or so.

Am I missing something?

Thanks

No, Kibana has intelligence built into it to take this into account and will only query the indices it needs based on the requested timestamps.

Wow, I did not know that. So I might have 365 indexes, which is a year's worth, and Kibana will simply handle it? I mean, it will figure out which ones to query and do whatever 'joins' are needed internally?

That's the high-level idea, yes.
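As a rough illustration of that idea (not Kibana's actual implementation), the set of daily indices a time range touches can be computed from the dates alone. The sketch assumes `logstash-YYYY.MM.DD` naming, and `daily_indices` is a made-up helper name. Elasticsearch accepts a comma-separated list of index names in the request path, as well as wildcards such as `logstash-2017.07.*`.

```python
from datetime import date, timedelta


def daily_indices(start, end, prefix="logstash-"):
    """List daily index names for every day from start to end, inclusive.

    Hypothetical helper; the naming scheme is an assumption.
    """
    days = (end - start).days
    return [
        f"{prefix}{(start + timedelta(days=i)).strftime('%Y.%m.%d')}"
        for i in range(days + 1)
    ]


# A three-day search only needs three indices, however many exist in total.
wanted = daily_indices(date(2017, 7, 4), date(2017, 7, 6))
print(f"GET {','.join(wanted)}/_search")
```

The query body itself stays the same as for a single index; only the list of indices in the path changes.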

Are there any other benefits to having many indexes versus one big one? In terms of speed or performance?

It's much easier to manage retention.

Beyond that, there's no real difference unless you end up with too many shards.

If your data volumes allow for a single index, I would probably recommend switching to monthly rather than daily indices. As data is allocated to indices based on timestamp, Kibana can limit the number of indices queried to only the ones that hold data relevant to that period, which can result in less data queried.
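To put numbers on the monthly versus daily choice, here is a small sketch that counts how many indices a one-year search would span under each scheme. The `logstash-YYYY.MM` naming and the helper name `monthly_indices` are assumptions for illustration only.

```python
from datetime import date


def monthly_indices(start, end, prefix="logstash-"):
    """List monthly index names (prefix + YYYY.MM) covering start..end inclusive.

    Hypothetical helper; the naming scheme is an assumption.
    """
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}{year:04d}.{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


start, end = date(2017, 1, 1), date(2017, 12, 31)
monthly = monthly_indices(start, end)
daily_count = (end - start).days + 1

print(f"monthly indices for the year: {len(monthly)}")
print(f"daily indices for the year:   {daily_count}")
```

With a fixed number of shards per index, fewer indices means fewer shards overall, which is the concern raised above.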
