Sorting memory impact on Elasticsearch

Yashasvi_Raj_Pant · November 5, 2018, 9:01am

Suppose I have to sort Elasticsearch documents by a field "A" (Elasticsearch 2.4.4), as shown here:

https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-request-sort.html

i.) Which will consume more memory: a high document count for the type in the index, or high cardinality (number of unique values) of the field "A"?

ii.) Which data type of "A" will consume more memory (string, long, date, etc.)?

Thanks.

polyfractal · November 12, 2018, 3:40pm

The docs there probably need tweaking a bit. Starting in 2.x (iirc), doc values are enabled by default for all fields except text (analyzed string) fields: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/doc-values.html

Doc values are disk-based and consume little heap memory at runtime, since they are read through the OS filesystem cache rather than loaded onto the JVM heap. So sorting on anything except analyzed text fields will use doc values and not consume much memory, regardless of document count or data type.
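For reference, a sort on a doc-values-backed field needs nothing special in the request. The snippet below is only a minimal sketch that builds the request body in Python; the helper name `build_sort_request` is made up for illustration, and the field name "A" comes from the question.

```python
import json


def build_sort_request(field, order="asc", size=10):
    """Build a search request body that sorts hits by `field`.

    Hypothetical helper for illustration; it only assembles the JSON body
    and does not talk to a cluster.
    """
    return {
        "query": {"match_all": {}},
        "sort": [{field: {"order": order}}],
        "size": size,
    }


# Sort by field "A" in descending order.
body = build_sort_request("A", order="desc")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request. If "A" is a long, a date, or a not_analyzed string, the sort should be served from doc values.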

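Text fields continue to need in-memory field data for sorting, which can be expensive. In that case, smaller/lower-"cardinality" text fields will consume less memory. For example, analyzed strings of full text will consume less memory than unique alphanumeric identifiers, because natural-language text reuses a limited vocabulary across documents while every unique identifier adds a new term to field data.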
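If sorting on an analyzed string is really needed, one common way to sidestep field data in 2.x is to add a not_analyzed sub-field and sort on that instead. This is a sketch under assumptions: the sub-field name `raw` and the helper name are arbitrary choices, not something from the question, and the fragment below is only the `properties` part of a type mapping.

```python
import json


def analyzed_string_with_raw(field):
    """Return a 2.x-style `properties` fragment for an analyzed string field
    with a not_analyzed `raw` sub-field (hypothetical name) that is backed
    by doc values and can be used for sorting.
    """
    return {
        "properties": {
            field: {
                "type": "string",  # analyzed by default in 2.x
                "fields": {
                    "raw": {"type": "string", "index": "not_analyzed"}
                },
            }
        }
    }


mapping = analyzed_string_with_raw("A")
# Sort on the sub-field rather than on the analyzed field itself.
sort_clause = [{"A.raw": {"order": "asc"}}]

print(json.dumps(mapping, indent=2))
print(json.dumps(sort_clause))
```

Searching still goes against the analyzed "A", while sorting on "A.raw" orders documents by the whole original string instead of by individual analyzed terms.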
