Would doc values be loaded into the JVM?

wangqinghuan · June 15, 2017, 3:18am

Hi,
I know that when doc values are enabled for a field, its data is written to disk in a column-oriented layout.
But I don't understand why sorting on doc values doesn't increase the load on the JVM. Whatever the sorting algorithm, the data has to be loaded into the JVM to be sorted. That should put a heavy load on the JVM when I sort an entire index, yet in practice I see no change. How does Lucene sort with doc values? Can the sort algorithm work directly on the memory-mapped (mmap) file?

rjernst · June 16, 2017, 1:55am

Essentially, yes. Lucene relies on the filesystem cache to keep hot data in memory. The index files are accessed through mmap, so their contents sit in the OS page cache rather than being copied onto the JVM heap; hence reading them does not consume JVM heap space.
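
To make the mmap point concrete, here is a minimal sketch using only the JDK, not Lucene's own classes. It assumes a toy column format of one raw 8-byte long per document; real doc values are encoded and compressed, so the class name `MmapColumnSketch` and its methods are purely illustrative. The mapped buffer is a direct (off-heap) buffer, and reading a value for a doc ID pulls bytes from the OS page cache without allocating a copy of the file on the JVM heap.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapColumnSketch {

    // Writes one 8-byte long per document: a toy stand-in for a numeric
    // doc-values column (hypothetical format, not Lucene's encoding).
    public static void writeColumn(Path file, long[] values) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (long v : values) {
            buf.putLong(v);
        }
        Files.write(file, buf.array());
    }

    // Maps the file read-only. The mapping stays valid after the channel is
    // closed, and its pages are backed by the OS page cache, not the JVM heap.
    public static MappedByteBuffer mapColumn(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            mapped.order(ByteOrder.LITTLE_ENDIAN);
            return mapped;
        }
    }

    // Random access by doc ID: an absolute read at a fixed offset.
    public static long valueOf(ByteBuffer column, int docId) {
        return column.getLong(docId * Long.BYTES);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("column", ".bin");
        file.toFile().deleteOnExit();
        writeColumn(file, new long[] {42L, 7L, 19L});

        MappedByteBuffer column = mapColumn(file);
        System.out.println("direct (off-heap) buffer: " + column.isDirect());
        System.out.println("value for doc 1: " + valueOf(column, 1));
    }
}
```

Only the small buffer object itself lives on the JVM heap; the file contents are paged in and out by the operating system as they are touched.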

wangqinghuan · June 16, 2017, 10:34am

When sorting on doc values, does it work directly off the filesystem cache? A sorting algorithm needs some additional space. Is that space allocated in the filesystem cache?

jpountz · June 16, 2017, 12:22pm

We do not sort all values; we only compute the top-k hits using a heap (priority queue) of size from + size. This priority queue is allocated on the JVM heap but is expected to be small.
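
The top-k idea can be sketched in plain Java as follows. This is an illustration of the technique, not Lucene's collector code: the class `TopKSketch`, the method `topPage`, and the in-memory `long[]` standing in for per-document sort values are all assumptions made for the example. The priority queue never holds more than from + size entries, so its memory use is independent of how many documents are scanned.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class TopKSketch {

    // Returns the doc IDs for the page [from, from + size) when documents are
    // sorted ascending by value, with ties broken by lower doc ID.
    public static int[] topPage(long[] values, int from, int size) {
        int k = from + size;

        // Max-heap over (value, docId): the root is the worst hit currently
        // kept, so it is the one to evict when a better candidate arrives.
        PriorityQueue<Integer> heap = new PriorityQueue<>((a, b) -> {
            int byValue = Long.compare(values[b], values[a]);
            return byValue != 0 ? byValue : Integer.compare(b, a);
        });

        for (int doc = 0; doc < values.length; doc++) {
            heap.add(doc);
            if (heap.size() > k) {
                heap.poll(); // drop the worst so at most k entries remain
            }
        }

        // The heap yields the worst hit first, so fill the array from the back.
        int[] best = new int[heap.size()];
        for (int i = best.length - 1; i >= 0; i--) {
            best[i] = heap.poll();
        }

        // Skip the first `from` hits to return only the requested page.
        int start = Math.min(from, best.length);
        return Arrays.copyOfRange(best, start, best.length);
    }

    public static void main(String[] args) {
        long[] values = {50L, 10L, 40L, 20L, 30L};
        // Sorted ascending the doc order is 1, 3, 4, 2, 0; page from=1, size=2.
        System.out.println(Arrays.toString(topPage(values, 1, 2))); // prints [3, 4]
    }
}
```

Each document costs one heap insertion and at most one eviction, so the scan is O(n log k) in time and O(k) in JVM heap, where k = from + size.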
