Hello,
I'm trying to implement pagination for a service that pulls data out of Elasticsearch using the search_after functionality. The data only needs to be sorted by one field, but, as the docs mention, a field with unique values must be added as a tiebreaker.
I first tried using _uid, as suggested in the docs, but that comes with a big performance penalty, as it does not have doc_values enabled on it.
I then switched to using another keyword-type field as the tiebreaker, which improved things, but it still takes a lot more time and memory than sorting on a single field.
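For illustration, a search_after request with a tiebreaker can be sketched as below. This is only a minimal sketch: the field names `timestamp` and `event_id`, the sample sort values, and the helper `build_page_request` are hypothetical placeholders, not my real mapping or code.

```python
import json


def build_page_request(sort_field, tiebreaker_field, page_size, last_sort_values=None):
    """Build a search body sorted by one field plus a unique tiebreaker.

    last_sort_values is the "sort" array of the last hit on the previous
    page; leave it as None when requesting the first page.
    """
    body = {
        "size": page_size,
        "query": {"match_all": {}},
        "sort": [
            {sort_field: "desc"},       # the only sort I actually care about
            {tiebreaker_field: "asc"},  # unique field, only there to break ties
        ],
    }
    if last_sort_values is not None:
        # search_after takes one value per sort clause, in the same order
        body["search_after"] = list(last_sort_values)
    return body


# Hypothetical field names and values, for illustration only.
first_page = build_page_request("timestamp", "event_id", 100)
next_page = build_page_request(
    "timestamp", "event_id", 100, [1500000000000, "evt-000123"]
)
print(json.dumps(next_page, indent=2))
```

The second sort clause is the part whose overhead I'm asking about.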
Does anyone have a suggestion for which second field I should use to minimize the overhead of the second sort? Or is there any other way to make search_after paging work?
Also, is there any way I can measure how much time the second sort takes? The Profile API only shows query stats, but no sorting data, as far as I can see.
I'm using Elasticsearch version 5.5, on a cluster with 3 data nodes, 3 coordinating nodes and 3 master nodes. The data is split using an index-per-time-frame approach (index configuration: 9 shards, 2 replicas), with each index holding 2 weeks' worth of data, about 30 million docs.
Thanks!