This is working as designed.
Sort order is valid within partitions, not across them.
Partitioning is a coping strategy for when your data is distributed across lots of machines, which makes it difficult to perform certain operations in a single search. It relies on your client application stitching together results from multiple partitions where appropriate.
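To make the stitching concrete, here is a minimal Python sketch. It assumes a hypothetical keyword field `seller_id` and responses shaped like a standard terms aggregation result. `partition_request`, `merge_partitions` and the aggregation name `by_key` are illustrative names, not part of any client library.

```python
def partition_request(field: str, partition: int, num_partitions: int, size: int) -> dict:
    """Build a search body that asks for one partition of a terms aggregation."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "aggs": {
            "by_key": {
                "terms": {
                    "field": field,
                    "include": {"partition": partition, "num_partitions": num_partitions},
                    "size": size,
                    # This order only holds inside the requested partition.
                    "order": {"_count": "desc"},
                }
            }
        },
    }


def merge_partitions(responses: list) -> list:
    """Combine the buckets of every partition and sort them globally by count."""
    buckets = []
    for response in responses:
        buckets.extend(response["aggregations"]["by_key"]["buckets"])
    # Descending doc_count, with the key as a tie-breaker for a stable order.
    return sorted(buckets, key=lambda b: (-b["doc_count"], b["key"]))
```

Note that the client has to fetch every partition before it can show even the first page in global order, which is why this approach gets expensive for large numbers of keys.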
The challenge is data locality.
Unless all the data related to a seller is kept on the same shard, it’s harder to reason about each seller. Time-based indices make this harder. The fact that you’re sorting by lowest doc count makes it harder still.
If you can put all docs related to the same seller in one shard, there’s hope. “Custom routing” and the “transform” API are two ways to ensure the information related to each seller is available in only one shard.
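As a rough sketch of the custom routing option, the Python below builds a bulk request body in which each document is routed by its seller. The field name `seller_id`, the index name and the helper `bulk_lines` are assumptions for illustration, and the `routing` key in the action metadata should be checked against the bulk API documentation for your Elasticsearch version.

```python
import json


def bulk_lines(index: str, docs: list, routing_field: str = "seller_id") -> str:
    """Return a newline-delimited bulk body that routes each doc by one field."""
    lines = []
    for doc in docs:
        # Docs sharing a routing value are sent to the same shard of the index.
        action = {"index": {"_index": index, "routing": str(doc[routing_field])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Routing only co-locates a seller's docs within one index, so with time-based indices the same seller still ends up spread across several indices.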
Hi @Mark_Harwood, my index has 1 shard and 1 replica, so it should work, right? But descending order on count is not working even though I have only one shard for my index.
I mean that descending sorting on count is not working across all the data. I’m looking for a way to return data sorted in descending order on _count across all partitions.
From your comment, I understand that sorting may not work across all the data if it resides on different shards/machines. But my index is configured with 1 shard. Please suggest whether this is possible by changing any configuration.
I also want paginated buckets. The query may return more than 1 lakh (100,000) buckets, but it should return only 15 buckets at a time, which is why I am using partitions.
Deep pagination on a derived value is tough because the deeper you go into the smaller values, the more of the other keys and derived values you have to hold in memory just to figure out where you are in the global order.
Maybe the solution is to turn the derived value into a concrete one, using the “transform” API to create a derived index holding the totals for each key. This index can be queried and sorted on the total field using the search_after param. It can be implemented efficiently because it requires less memory, but it does not support direct indexing into the sort position (e.g. randomly picking a page number from the results).
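A minimal sketch of that idea in Python, building only the request bodies. The index and field names (`key`, `total`) and the helpers `transform_body`, `page_request` and `next_cursor` are hypothetical, and the responses are assumed to have the usual search hit shape.

```python
def transform_body(source_index: str, dest_index: str, key_field: str) -> dict:
    """Pivot transform config: one doc per key, holding that key's doc count."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "pivot": {
            "group_by": {"key": {"terms": {"field": key_field}}},
            "aggregations": {"total": {"value_count": {"field": key_field}}},
        },
    }


def page_request(page_size: int, after=None) -> dict:
    """Search body for one page of the derived index, highest totals first."""
    body = {
        "size": page_size,
        # The key acts as a tie-breaker so every doc has a unique sort position.
        "sort": [{"total": "desc"}, {"key": "asc"}],
    }
    if after is not None:
        body["search_after"] = after
    return body


def next_cursor(response: dict):
    """Sort values of the last hit, to pass as `after` for the next page."""
    hits = response["hits"]["hits"]
    return hits[-1]["sort"] if hits else None
```

Each page request carries only the cursor from the previous page, so the client walks the results in order, 15 at a time, without holding the earlier pages in memory.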