Sorted scroll internal workings for performance optimizations

Hi, could anybody please confirm my hypotheses regarding sorted scroll?

  1. I assume that a normal sorted search:
  • loads all fields we sort on into memory in the case of fielddata; in the case of doc_values we are just not limited by heap (but it is hard to imagine sorting without using memory)
  • returns all hits sorted
  2. I can't really imagine how sorted scrolling is implemented, though, because you have to sort, say, a billion documents over a doc_values field, at least partially. Until then scrolling cannot start, because you wouldn't know which document is really the first one to fetch, right? BUT scrolling starts returning sorted results immediately, although very slowly (800 docs/s), which is 10 times slower than scanning the same thing.
  • What is the bottleneck here? I have about 7% IO wait, which rules out an issue with doc_values, and low network traffic, but CPU is at 98%, which looks as if a sorting thread in the background were feeding the scrolling/searching thread with documents to retrieve. If so, is this sorting thread the bottleneck? Meaning that reading 300M doc_values fields and sorting them is an IO- and computation-heavy operation that makes it 10 times slower?
  • Is there any way to improve it? All fields in my documents are doc_values and it is a search/scroll-only node, so I assume that I should split the available memory into something like 30% heap / 70% off-heap, right? That would leave a sufficient amount of OS RAM for the FS cache. Is there anything else I'm not aware of?

Hi,

The way sorted scroll works is that Elasticsearch remembers the last returned document on every shard. Then, when you ask for the next page, all documents that match your query and compare better than this last returned document (that is, sort before it) are ignored.

This explains why sorted scroll is less efficient than scan scroll, since scan can just go back to where it had stopped before, collect ${page_size} documents and return. In contrast, sorted scroll needs to visit all matching documents for every page.
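To make the difference concrete, here is a toy model in Python, assuming a single shard whose documents sit in a plain list. The names `scan_page`, `sorted_page`, `run_demo` and the `ts` sort field are made up for illustration; this is not Elasticsearch's code, only a sketch of the behaviour described above.

```python
import heapq


def scan_page(docs, matches, cursor, page_size):
    """Scan-style paging: resume at the stored position, stop when the page is full."""
    page, visited = [], 0
    while cursor < len(docs) and len(page) < page_size:
        doc = docs[cursor]
        cursor += 1
        visited += 1
        if matches(doc):
            page.append(doc)
    return page, cursor, visited


def sorted_page(docs, matches, sort_key, last_key, page_size):
    """Sorted-style paging: visit every matching doc, ignore those already returned."""
    visited, candidates = 0, []
    for doc in docs:
        if not matches(doc):
            continue
        visited += 1
        key = sort_key(doc)
        if last_key is not None and key <= last_key:
            continue  # sorts before (or is) the last returned doc: already delivered
        candidates.append((key, doc))
    # Keep only the best page_size candidates, in sort order.
    top = heapq.nsmallest(page_size, candidates, key=lambda c: c[0])
    page = [doc for _, doc in top]
    new_last = sort_key(page[-1]) if page else last_key
    return page, new_last, visited


def run_demo(n_docs=10, page_size=3):
    """Page through the same docs both ways and count how many docs each visits."""
    # Hypothetical data: "ts" is a shuffled sort value, "id" breaks ties.
    docs = [{"id": i, "ts": (i * 7) % n_docs} for i in range(n_docs)]
    matches = lambda d: True
    key = lambda d: (d["ts"], d["id"])

    scan_visits, cursor = 0, 0
    while True:
        page, cursor, visited = scan_page(docs, matches, cursor, page_size)
        scan_visits += visited
        if not page:
            break

    sorted_visits, last, ordered = 0, None, []
    while True:
        page, last, visited = sorted_page(docs, matches, key, last, page_size)
        sorted_visits += visited
        if not page:
            break
        ordered.extend(d["ts"] for d in page)

    return scan_visits, sorted_visits, ordered
```

In this model the scan visits each document once in total, while the sorted variant visits every matching document once per page, so its cost grows with the number of pages requested.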

The CPU usage part is interesting. Can you try to capture the hot threads output and share it? It could give useful information about where your CPU is spending time.
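For reference, a minimal sketch of capturing that output from Python using only the standard library. It assumes a node reachable at `http://localhost:9200` (adjust to your setup); the helper names `hot_threads_url` and `fetch_hot_threads` are made up for this example, while `/_nodes/hot_threads` and its `threads`, `interval` and `type` parameters are the nodes hot threads API.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(host="http://localhost:9200", threads=3, interval="500ms", kind="cpu"):
    """Build the hot threads URL; host and defaults here are assumptions."""
    params = urlencode({"threads": threads, "interval": interval, "type": kind})
    return f"{host}/_nodes/hot_threads?{params}"


def fetch_hot_threads(url, timeout=10):
    """Return the plain-text hot threads report from the node."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `print(fetch_hot_threads(hot_threads_url()))` while the sorted scroll is running should give a report that can be pasted into the thread.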

Thank you @jpountz. Before I run it again (I have a different 10-hour job running now), could you please give me some optimization pointers regarding sorted scrolling with doc_values only? I guess that:

  1. The fielddata cache is irrelevant here because of the doc_values
  2. The filter cache is still important, but I don't think it needs more than 10%, or does it?
  3. Is a 30% heap / 70% off-heap ratio optimal for a search-only node with doc_values fields only, since everything is in the page cache / FS cache instead of the heap?

Indeed, the fielddata cache is mostly irrelevant if you have doc values, unless you run e.g. terms aggregations, which need to load a global ordinal map on top of doc values and store it in the fielddata cache.

10% is indeed already a lot for the filter cache. If you want to optimize its usage, a better idea than increasing its size is usually to explicitly disable caching on filters which are not reused. By the way, Elasticsearch 2.0 will cache filters much less aggressively for that reason.
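As an illustration of disabling caching on a one-off filter, here is a small sketch that builds a sorted search body as a Python dict. It assumes the pre-2.0 query DSL, where a filter accepts a `_cache` flag inside a `filtered` query; the field names `status` and `timestamp` and the helper names are hypothetical.

```python
import json


def uncached_term_filter(field, value):
    """Term filter with caching switched off (pre-2.0 `_cache` flag)."""
    return {"term": {field: value, "_cache": False}}


def sorted_scroll_body(filter_clause, sort_field, page_size):
    """Search body for a sorted scroll; send it with a scroll timeout such as ?scroll=1m."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": filter_clause,
            }
        },
        "sort": [{sort_field: "asc"}],
        "size": page_size,
    }


# Hypothetical usage: the filter is used once, so it should not occupy the filter cache.
body = sorted_scroll_body(uncached_term_filter("status", "active"), "timestamp", 500)
print(json.dumps(body, indent=2))
```

Filters that are reused across many requests are better left cacheable; the flag is only worth setting on filters you do not expect to see again.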

If your node can handle the load at a 30/70 ratio, then it sounds great. In general, the more you can give to the filesystem cache, the better.