Sorting by the opposite direction is very slow

Jason_Baumgartner · October 11, 2018, 9:40pm

I have an index where I used the new Lucene feature of indexing on a specific field. In my mapping, I have the following:

"sort.field": "created_utc",
"sort.order": "desc",

When I request documents using the same sort order, matches come back extremely fast. For example, I am searching Reddit submissions with an API I created:

https://beta.pushshift.io/reddit/submission/search/?sort=desc&size=1&pretty=true&metadata=true

If you look at the metadata->elasticsearch->took value, this search completes in milliseconds.

However, if I sort in the opposite direction, the search time is in seconds (usually over 10):

https://beta.pushshift.io/reddit/submission/search/?sort=asc&size=1&pretty=true&metadata=true

From what I know about binary search, searching on a sorted array is a O(log N) operation. However, sorting in the opposite direction should also be nearly as fast. Is this a bug with Lucene's implementation of their index?

jimczi · October 15, 2018, 11:41am

From what I know about binary search, searching on a sorted array is a O(log N) operation. However, sorting in the opposite direction should also be nearly as fast. Is this a bug with Lucene's implementation of their index?

Sorting the result of a search is not a binary search. We collect all documents that match the query in the order of their appearance in the index and keep track of the N best ones with a priority queue. If the index sort matches the search sort we can early terminate the query when N documents are collected, otherwise all documents must be collected. The reverse index sort order is the worst case since it requires to change the top N every time we visit a new document. There was some discussions to optimize this case in Lucene though ([LUCENE-7482] Faster sorted index search for reverse order search - ASF JIRA) with some interesting ideas. The issue is quite old so it could be interesting to open a new one that focuses on reverse index sort.

system · November 12, 2018, 11:52am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Issue with Index Sort -- The sort should work just as fast in either direction Elasticsearch	1	329	August 21, 2019
Elastic Search Sorting takes time between 4-5 secs Elasticsearch	3	2707	July 6, 2017
Questions about default sorting！！！ Elasticsearch	3	312	August 6, 2020
How to sort on small cluster with 100m+ documents without OOME Elasticsearch	2	293	July 6, 2017
Performance difference between searching with one field against with two fields Elasticsearch	6	1300	July 5, 2017

Sorting by the opposite direction is very slow

Related topics