How does order-by-field work (performance)?

ddorian43 · September 30, 2016, 8:54pm

Say I have 2 indexes, one with 100K documents and one with 1M documents.
Then I run a filter that returns 1K documents from each index and sort by a biginteger field. Which one will complete faster?
Meaning, is the sort speed based on the number of filtered documents or the total number of documents (assuming everything is in RAM)?

nik9000 · October 1, 2016, 11:35am

If by biginteger you mean an arbitrary-precision field, then that isn't a thing we support. Long is the best we've got.

The dominant factor is going to be the number of hits whose value for that field is in a disk block that the OS hasn't paged in. If all blocks are paged in then the dominant factor is the number of hits.

Sorting (by a field or by score) works by making a min-heap sized to the number of hits requested and dumping all the matching hits into it, discarding hits that "fall out" or "don't fit". If you sort by _score then the query has to figure out a score for each hit and that score is the sort key. If you are sorting by a field, the queries skip figuring out a score and instead the sort key is the document's value for that field (or those fields if sorting by more than one field).
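
To make the heap idea concrete, here is a minimal Python sketch of a bounded min-heap collecting the top k hits in descending order of a field. It is an illustration only, not the actual Lucene or Elasticsearch code; the function name `top_k_desc`, the `price` field, and the dict-shaped hits are all made up for the example.

```python
import heapq


def top_k_desc(hits, field, k):
    """Return the k hits with the largest value of `field`, largest first.

    Hypothetical helper: `hits` is any iterable of dicts standing in for
    the documents that matched the filter.
    """
    # Min-heap of (sort key, tie-breaker, hit). heap[0] is always the
    # weakest hit currently kept, so it is the one that "falls out" next.
    heap = []
    for seq, hit in enumerate(hits):
        # -seq makes earlier hits win ties and keeps tuple comparison
        # from ever reaching the dict itself.
        entry = (hit[field], -seq, hit)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        else:
            # Push the new entry, then discard the smallest of the k + 1.
            heapq.heappushpop(heap, entry)
    # Read the survivors back out in descending order.
    return [hit for _, _, hit in sorted(heap, reverse=True)]


hits = [{"id": i, "price": p} for i, p in enumerate([5, 1, 9, 3, 9, 7])]
top3 = top_k_desc(hits, "price", 3)
```

Every matching hit is visited once and the heap never grows past k entries, so in this sketch the work scales with the number of hits that matched the filter rather than with the total number of documents in the index.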

Getting the value for a field should be fairly fast. This talk went into some detail about it, but I can't find the video at this point, sadly. It is really one of my favorite talks.
