How to efficiently return most recent documents using index pattern by date

I am having trouble doing what I feel like should be really simple. I want to return the most recent 500 documents that match a query, using the JSON query language.

I tried just sorting on the date and then setting a limit, like this:

{
  "size": 500,
  "sort": { "@timestamp": { "order": "desc" } }
}

This times out. (Obviously I could increase the timeout, but my question is as much about performance as it is about functionality. The timeout is already at 10 seconds, which seems generous for what I want to do.)

I am using an index pattern with separate indices named by date, so Elasticsearch should be able to do this efficiently. Yet the query times out even when the most recent index alone contains 500 matching documents.

What appears to be happening is that it is gathering up all the matching documents for all time, sorting them, and then applying the limit. This is obviously not an optimal query plan given the underlying data structure.

What do I need to do to enlighten Elasticsearch that, since I am asking for the N most recent documents, it should scan the indices in order and stop once the limit is fulfilled?
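
To make the plan I have in mind concrete, here is a minimal client-side sketch of it in Python. It assumes daily indices whose names sort chronologically (for example a hypothetical `logs-2024.01.31`), and `search_index` is a hypothetical callable standing in for whatever client call runs the query against a single index and returns hits sorted newest first. This is not what Elasticsearch does internally, only an illustration of the behaviour I am asking for.

```python
from typing import Callable, Dict, List

Hit = Dict[str, object]


def most_recent_hits(
    index_names: List[str],
    search_index: Callable[[str, int], List[Hit]],
    limit: int = 500,
) -> List[Hit]:
    """Collect up to `limit` hits by visiting indices newest first."""
    collected: List[Hit] = []
    # Assumption: names such as "logs-2024.01.31" sort chronologically,
    # so reverse lexical order visits the newest index first.
    for name in sorted(index_names, reverse=True):
        remaining = limit - len(collected)
        if remaining <= 0:
            break  # limit fulfilled; older indices are never queried
        # search_index is hypothetical: it should return hits from one
        # index, already sorted by @timestamp descending.
        collected.extend(search_index(name, remaining)[:remaining])
    return collected
```

Because each index covers a distinct date, hits from a newer index always precede hits from an older one, so concatenating per-index results keeps the overall order without a global sort.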

Which version are you using? I believe version 7 introduced improvements in this area.
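
In the meantime, one thing that may help is bounding the search by time, so that indices holding only older data can be ruled out cheaply. Below is a rough sketch of such a request body, built in Python. The `range` filter on `@timestamp` and the `track_total_hits` option (available in 7.x) are standard search features, but the seven-day window, the `user_query` argument, the `message` field in the usage line, and the function name are assumptions for illustration. Whether this avoids the timeout will depend on your version and data.

```python
import json
from typing import Any, Dict


def recent_docs_body(
    user_query: Dict[str, Any],
    window: str = "now-7d",  # assumed look-back window; widen if too few hits
    size: int = 500,
) -> Dict[str, Any]:
    """Build a search body for the newest `size` documents within `window`."""
    return {
        "size": size,
        # Skip exact hit counting; only the top `size` documents are needed.
        "track_total_hits": False,
        "query": {
            "bool": {
                "must": [user_query],
                # Time bound: shards with no data in this range may be skipped.
                "filter": [{"range": {"@timestamp": {"gte": window}}}],
            }
        },
        "sort": [{"@timestamp": {"order": "desc"}}],
    }


if __name__ == "__main__":
    # Hypothetical query on a hypothetical "message" field.
    print(json.dumps(recent_docs_body({"match": {"message": "error"}}), indent=2))
```

If the window returns fewer than 500 hits, the same request can be retried with a wider window such as `now-30d`.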
