I use the transport client to send a search request to an ES index that has 5 shards, with request parameters from=0 and size=10. I got 50 hits back (10 hits per shard, I think). But if I change the from parameter to something greater than 0, I get only 10 hits back, which is what I expect.
This sounds like pagination, with 10 results coming from each of the 5 shards. Can you share more about the exact search you are running and what you want to achieve?
What I want to achieve is that if I search with from=0 and size=10, I get only 10 hits, not 50. I'm sure it has something to do with shards: when I changed the index's shard count to 3, I got 30 hits.
@ji_luo What are you hoping to achieve here? You mention the transport client. Is this a one-time query to get the top ten (for example, the top ten movies with the fields you have provided)?
What will your application be doing with the queried data?
In case other people may encounter the same problem, I will write down the reason here.
The problem is caused by the search type. With the default search type, query_then_fetch, the number of hits returned is what I expected. With query_and_fetch, the search returns num_of_shards * size hits, which is more than I want; it puts more load on the network and increases response time.
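To illustrate the difference in hit counts, here is a rough Python sketch. It is a simplified model of the two search types, not Elasticsearch's actual implementation and not transport client code: the function names, the fake scores, and the document ids are all made up for the example, and it only covers the from=0 case described above.

```python
import heapq


def shard_top_hits(shard_docs, size):
    """Return one shard's best `size` (score, doc_id) pairs, highest score first."""
    return heapq.nlargest(size, shard_docs)


def query_then_fetch(shards, from_, size):
    """Two-phase model: shards report candidates, the coordinator keeps one page."""
    candidates = []
    for docs in shards:
        # Each shard must offer from_ + size candidates so the global page is correct.
        candidates.extend(shard_top_hits(docs, from_ + size))
    ranked = sorted(candidates, reverse=True)
    return ranked[from_:from_ + size]


def query_and_fetch(shards, size):
    """Single-phase model: every shard's top `size` hits are all passed back."""
    hits = []
    for docs in shards:
        hits.extend(shard_top_hits(docs, size))
    return sorted(hits, reverse=True)


# Hypothetical index: 5 shards with 20 documents each and unique scores 0..99.
shards = [
    [(score, f"shard{s}-doc{i}") for i, score in enumerate(range(s, 100, 5))]
    for s in range(5)
]

print(len(query_then_fetch(shards, 0, 10)))  # 10
print(len(query_and_fetch(shards, 10)))      # 50
print(len(query_and_fetch(shards[:3], 10)))  # 30
```

In this model the hit count for query_and_fetch grows with the number of shards (50 for 5 shards, 30 for 3), which matches the numbers I saw, while query_then_fetch always returns a single page of size hits.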