Elasticsearch "size: 100" operator internal behaviour

ypetrovic · May 4, 2017, 1:48pm

Hello,

For my thesis I'm currently investigating the speed (down to milliseconds) of Elasticsearch and some other NoSQL database systems.

My question, rather technical mind you, is: What is the internal behaviour when using the size operator in a search query?
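
For reference, the kind of request I mean looks roughly like the sketch below. The field name `title` and the search term are placeholders chosen for illustration; only the `size` value matters for the question.

```python
import json

# Hypothetical search body: the "title" field and the "example" term are
# placeholders. "size" limits how many hits are returned to the client.
search_body = {
    "size": 100,
    "query": {"match": {"title": "example"}},
}

# Print the JSON that would be sent as the body of a _search request.
print(json.dumps(search_body, indent=2))
```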

I've noticed that, compared to other database systems, Elasticsearch is very consistent in both the speed at which it returns data and the total number of items found. Where other databases take longer to return data the more results are found, Elasticsearch's response time is almost always the same, regardless of the total number of requests sent.

My hypothesis is that in Elasticsearch, when using the size operator, the number of documents that are actually looked up and retrieved once the index search has finished is exactly the number set in the size operator. In other database systems this is not the case: all documents that matched in the index are retrieved, and only the top X are eventually returned to the client.
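
To make the hypothesis concrete, here is a minimal Python sketch of the two strategies I have in mind. It is not taken from Elasticsearch or Lucene; the in-memory `matches` list, the `store` dictionary, and all function names are my own hypothetical stand-ins for an index and a document store. The `counter` only records how many full documents each strategy loads.

```python
import heapq


def fetch(store, doc_id, counter):
    """Load one full document and count the lookup."""
    counter["fetched"] += 1
    return store[doc_id]


def top_k_then_fetch(matches, store, size, counter):
    """Hypothesised behaviour: rank on (doc_id, score) pairs only,
    then load just the `size` best documents."""
    best = heapq.nlargest(size, matches, key=lambda m: m[1])
    return [fetch(store, doc_id, counter) for doc_id, _ in best]


def fetch_all_then_truncate(matches, store, size, counter):
    """Alternative behaviour: load every matching document,
    sort by score, and only then cut down to `size`."""
    docs = [(score, fetch(store, doc_id, counter)) for doc_id, score in matches]
    docs.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in docs[:size]]


# Toy data: 1000 matches with distinct scores (37 and 1000 are coprime).
matches = [(i, (i * 37) % 1000) for i in range(1000)]
store = {i: {"id": i} for i in range(1000)}

lazy_counter = {"fetched": 0}
eager_counter = {"fetched": 0}
lazy = top_k_then_fetch(matches, store, 100, lazy_counter)
eager = fetch_all_then_truncate(matches, store, 100, eager_counter)

print(lazy_counter["fetched"], eager_counter["fetched"])
```

Both functions return the same top 100 documents, but the first one loads 100 documents while the second loads all 1000. If my hypothesis holds, that difference would explain why the response time barely changes with the number of matches.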

Short of spending hours reading through the source code, I have no way to find out whether this hypothesis is correct. Or is this something that can be found in the Lucene documentation?

Thanks for taking the time to read this. Any responses are appreciated and will help me further my research.

