About the retrieval depth & ranking

sanzhiyuan · June 6, 2016, 12:32pm

Hi all,
I have 2 questions about retrieval and scoring:

How deep does ES go when retrieving documents, even without scoring? From what I have found so far, it visits all of them. Please help me confirm this. When I was developing a web search engine, the retriever would normally stop early once it judged that it had enough good candidate documents for the search, say just the top 10 out of 10,000 docs.
Is the ranking of search results for the same query on a static index stable or not? Most of the time it is stable, but sometimes ES returns different results. I was guessing this is caused by some bad shards, but I'm not sure. Any help?

Thanks all ^.^

warkolm · June 6, 2016, 11:01pm

https://www.elastic.co/guide/en/elasticsearch/guide/current/distributed-search.html may help clarify this.

sanzhiyuan · June 7, 2016, 2:47am

Thanks Mark :)
The chapter explains how the "dispatcher & gatherer" works, but I want to know how the gatherer knows when its own private priority queue is full. (My guess is that it can't retrieve exactly from+size docs, since pagination couldn't be stable if so, and it also doesn't seem to retrieve all docs, since the book says "deep paging is a problem". Sorting wouldn't be a problem if all docs were retrieved, I think.) https://www.elastic.co/guide/en/elasticsearch/guide/current/pagination.html

warkolm · June 7, 2016, 2:51am

It just grabs the requested number of docs (default 10) from each shard.

So if you have 5 shards, each provides the top 10, then the reduce phase takes that total of 50 and provides the top 10 from that.
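To make the reduce step above concrete, here is a minimal Python sketch of it. This is only an illustration of the idea, not Elasticsearch's actual code: the function name `merge_shard_results`, the `(score, doc_id)` tuple layout, and the tie-break on doc id are all assumptions made for the example.

```python
def merge_shard_results(shard_results, from_=0, size=10):
    """Reduce step: combine per-shard top hits into one global page.

    shard_results is a list with one entry per shard; each entry is that
    shard's own top (from_ + size) hits as (score, doc_id) tuples.
    The tuple layout and tie-break rule are assumptions for this sketch.
    """
    # Flatten the per-shard lists: 5 shards x 10 hits gives 50 candidates.
    all_hits = [hit for hits in shard_results for hit in hits]
    # Highest score first; break ties on doc_id so the order is repeatable.
    ranked = sorted(all_hits, key=lambda hit: (-hit[0], hit[1]))
    # Keep only the requested page.
    return ranked[from_:from_ + size]


# Hypothetical data: 5 shards, each returning its top 10 hits.
shard_results = [
    [(float(100 - (s + 5 * i)), f"s{s}-d{i}") for i in range(10)]
    for s in range(5)
]
top10 = merge_shard_results(shard_results)
```

Note that with `from_=990, size=10` every shard would have to supply 1,000 hits and the coordinating node would sort 5,000 of them to return 10, which is why the guide calls deep paging a problem.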

sanzhiyuan · June 7, 2016, 3:03am

Yes, I understand that part. I'd like to know how a shard chooses its top 10 results, I mean, the process of building this priority queue (from + size, 10 by default).

warkolm · June 7, 2016, 3:51am

Ah right, sorry! So that's this part - https://www.elastic.co/guide/en/elasticsearch/guide/current/sorting.html

Basically it scores anything that matches the query/filter.
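As a rough sketch of what that means on a single shard: every matching document is scored, and a bounded priority queue keeps only the best from + size of them. The Python below is a toy model, not Lucene's collector code; `shard_top_k`, `matches`, and `score` are hypothetical names, and real tie-breaking rules differ.

```python
import heapq


def shard_top_k(docs, matches, score, k=10):
    """Score every matching doc on one shard, keeping only the best k.

    docs maps doc_id -> document; matches and score are caller-supplied
    callables (hypothetical stand-ins for the query and its scoring).
    Returns (total_hits, top_k), with top_k sorted best first.
    """
    heap = []  # min-heap: the weakest hit we are still keeping is heap[0]
    total_hits = 0
    for doc_id, doc in docs.items():
        if not matches(doc):
            continue  # non-matching docs are never scored
        total_hits += 1
        entry = (score(doc), doc_id)
        if len(heap) < k:
            heapq.heappush(heap, entry)  # queue not full yet
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)  # evict the current weakest hit
    return total_hits, sorted(heap, reverse=True)


# Hypothetical shard: 100 docs, half of which match the query.
docs = {i: {"text": "es" if i % 2 == 0 else "other", "boost": i} for i in range(100)}
total_hits, top3 = shard_top_k(
    docs,
    matches=lambda d: d["text"] == "es",
    score=lambda d: float(d["boost"]),
    k=3,
)
```

The loop does not stop once the queue is full. A full queue only means later hits must beat the current weakest entry, so all matches are visited and the total hit count stays exact.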

sanzhiyuan · June 7, 2016, 3:58am

Thanks! This helps me a lot.
