Difference between sorting at the index level and inside the query

Hi,

What's the difference between sorting by a specific field with a sort clause in the query itself, and defining the sort in the index settings (that is, sorting at the segment level) without adding a sort clause to the query?

For example, what would the performance difference be between searching an index that was created with the following settings (like here):

"settings" : {
    "index" : {
        "sort.field" : "my_date_field", 
        "sort.order" : "desc" 
    }
}

And searching on an index that was created without such settings, by using:

"sort": [
{
  "my_date_field": "desc"
}
]

Theoretically, I would expect ES to use the (sorted) inverted index in both cases, which means that both cases should have similar performance.

If the inverted index isn't sorted, how does ES perform range queries efficiently?

Thanks!

Anyone? :slight_smile:

Please see the documentation at https://www.elastic.co/guide/en/elasticsearch/reference/7.4/index-modules-index-sorting.html - the first option (index sorting) sorts the data when it is written to disk.

This adds an extra step during indexing, so indexing throughput is reduced. In return, when the sort in the query is the same as the index sort, searches can be much faster, because searching presorted data makes early termination possible.
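
As a rough sketch of a search body that can take advantage of early termination: it assumes the index was created with the index sort settings from the question (my_date_field, desc), and the size of 10 is an arbitrary value chosen for illustration. According to the documentation linked above, setting track_total_hits to false tells Elasticsearch that the total hit count is not needed, so it can stop collecting from each segment once the first N documents have been found.

```json
{
  "size": 10,
  "sort": [
    { "my_date_field": "desc" }
  ],
  "track_total_hits": false
}
```

If the sort in the request does not match the index sort, or the total number of hits is still requested, all matching documents have to be visited as usual.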

Hope this helps.

I think what @liorisme6 was asking about is the difference between defining index sorting and not defining it at all. That is, can you explain how Elasticsearch will sort, let's say, a numeric field when using:

"sort": [
{
  "my_date_field": "desc"
}]

and without defining index sorting.
I would expect ES to use the (sorted) inverted index (when the query is using the same field "my_date_field" of course). Isn't that the case?

Yes, that is where the speed-up comes from.
