Hi all,
I have two questions about retrieval and scoring:
1. How deep does ES go when retrieving documents, even before scoring? From what I have read so far, it retrieves all matching documents. Could someone confirm that this is how it works? When I was developing a web search engine, the retriever would normally stop early once it judged that it had "enough good" candidate documents for the search, for example just the top 10 out of 10,000 docs.
2. Is the ranking of search results stable for the same query on a static index? Most of the time it is, but sometimes ES returns different results. My guess is that this is caused by some bad shards, but I am not sure. Any help?
Thanks Mark :)
The chapter explains how the "dispatcher & gatherer" works, but I want to know how the gatherer decides that its own private priority queue is full. (My guess is that it can't retrieve exactly from + size docs, because pagination wouldn't be stable if it did. It also doesn't seem to retrieve all docs, since the book says "deep paging is a problem", and sorting wouldn't be a problem if all docs were retrieved anyway, I think.) https://www.elastic.co/guide/en/elasticsearch/guide/current/pagination.html
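To make the question concrete, here is a minimal Python sketch of how I read the pagination chapter: each shard hands back its own best from + size hits, and the coordinating node merges them and keeps only the requested page. The names `shard_top_hits` and `coordinate` and the (score, doc_id) tuples are my own illustration, not Elasticsearch APIs, and each "shard" is just a list of already-scored hits.

```python
import heapq

def shard_top_hits(shard_docs, from_, size):
    """Return this shard's best (from_ + size) hits as (score, doc_id), best first."""
    # Hypothetical helper: shard_docs is a list of already-scored (score, doc_id) pairs.
    return heapq.nlargest(from_ + size, shard_docs)

def coordinate(shards, from_=0, size=10):
    """Merge every shard's local top hits and keep only the requested page."""
    candidates = []
    for shard_docs in shards:
        candidates.extend(shard_top_hits(shard_docs, from_, size))
    # Global sort over at most num_shards * (from_ + size) candidates.
    candidates.sort(reverse=True)
    return candidates[from_:from_ + size]

shards = [
    [(0.9, "a"), (0.1, "b")],
    [(0.8, "c"), (0.7, "d")],
]
page = coordinate(shards, from_=1, size=2)
```

If this reading is right, the cost of deep paging comes from the merge step: the number of candidates grows with num_shards * (from + size), even though only `size` hits are returned.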
Yes, I understand that part. What I'd like to know is how a shard chooses its top 10 results, that is, the process of building this priority queue (from + size, 10 by default).
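For reference, this is the generic bounded priority queue technique I have in mind, as a minimal Python sketch. It assumes every matching document is visited and scored, which is exactly the part I am unsure about for ES. `collect_top_k` is a hypothetical name, doc ids are assumed to be integers, and the tie-break (smaller doc id wins on equal score) is also an assumption of the sketch.

```python
import heapq

def collect_top_k(scored_docs, k):
    """Keep the k best (score, doc_id) pairs while scanning every match once."""
    if k <= 0:
        return []
    heap = []  # min-heap: heap[0] is the weakest hit currently kept
    for score, doc_id in scored_docs:
        # Negate the doc id so that, on equal score, the smaller id ranks higher.
        entry = (score, -doc_id)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # The new hit beats the weakest kept hit, so it takes its place.
            heapq.heapreplace(heap, entry)
    # Best first; undo the negation used for tie-breaking.
    return [(score, -neg_id) for score, neg_id in sorted(heap, reverse=True)]

top = collect_top_k([(1.0, 3), (2.0, 1), (1.0, 2), (0.5, 4)], k=2)
```

In this sketch the queue never stops the scan early. It only bounds memory to k entries, so the work is still proportional to the number of matching docs.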