8.3 Memory calculations and nested fields

ewolfman · November 2, 2022, 1:23pm

Hi,

I'm trying to better understand the new memory planning guidance for 8.3.

Do nested fields count just like other fields?

My indices have nested fields, so a single index, including its nested fields, has 100 fields. According to the above documentation, if I have 1000 indices this means 1000 (indices) x 100 (fields) x 1kB = 0.1GB of heap (is this correct?)

Do nested fields count just like other fields in the index?
Just to be on the safe side: as of 8.3 the number of shards is no longer relevant for memory calculations?
Does the amount of data within the index not matter? In other words, if there are 2 indices with exactly the same mapping (e.g. rollover), and one index has 100M records while the other has just a single record, is there no difference in the memory calculations?
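The arithmetic in the question above can be checked with a small sketch. It assumes the rule of thumb quoted later in the thread (at least 1kB of heap per field per index) and treats 1kB as 1000 bytes; the function name is hypothetical, and the result covers field mappings only, not the other overheads the documentation mentions.

```python
# Rough lower bound for mapping heap, following the 8.3/8.4 rule of thumb
# "at least 1kB of heap per field per index". Overheads are not included.
BYTES_PER_FIELD = 1000  # assumption: 1kB read as 1000 bytes


def estimate_mapping_heap_bytes(num_indices, fields_per_index,
                                bytes_per_field=BYTES_PER_FIELD):
    """Return indices x fields x bytes-per-field, in bytes."""
    return num_indices * fields_per_index * bytes_per_field


estimate = estimate_mapping_heap_bytes(num_indices=1000, fields_per_index=100)
print(f"{estimate} bytes = {estimate / 1000**3:.1f} GB")
```

With 1000 indices of 100 fields each this gives 100,000,000 bytes, which matches the 0.1GB figure in the question.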

DavidTurner · November 2, 2022, 2:02pm

I believe nested fields count like normal fields, but the simplest answer is to upgrade to 8.5 which reports the overhead size in the node stats:

There's still some amount of per-shard overhead but in most setups it's not worth considering.

Likewise, there's still some amount of per-segment overhead but it rarely matters.
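As an illustration of reading that figure once a cluster is on 8.5, here is a minimal sketch that pulls the per-node mapping overhead out of an already-parsed node stats response. The field path (`nodes.<id>.indices.mappings.total_estimated_overhead_in_bytes`) is my assumption about the response shape, and the sample values are made up for the example; check both against a real response from your own cluster.

```python
def mapping_overhead_by_node(node_stats):
    """Map node name -> estimated mapping overhead in bytes (None if absent).

    `node_stats` is the parsed JSON body of a node stats response.
    The field path used here is an assumption, not taken from the thread.
    """
    result = {}
    for node_id, node in node_stats.get("nodes", {}).items():
        mappings = node.get("indices", {}).get("mappings", {})
        name = node.get("name", node_id)
        result[name] = mappings.get("total_estimated_overhead_in_bytes")
    return result


# Illustrative sample only; these numbers are not real measurements.
sample = {
    "nodes": {
        "abc123": {
            "name": "data-1",
            "indices": {
                "mappings": {"total_estimated_overhead_in_bytes": 52428800}
            },
        },
        "def456": {"name": "data-2", "indices": {}},
    }
}

print(mapping_overhead_by_node(sample))
```

Nodes that do not report the field come back as `None`, so a missing value is not mistaken for zero overhead.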

ewolfman · November 2, 2022, 2:26pm

Thanks David. I can see now that the documentation section "Data nodes should have at least 1kB of heap per field per index, plus overheads" from 8.3 & 8.4 has been replaced with "Allow enough heap for field mappers and overheads".

What I am trying to do is plan an upgrade from 7.x to the latest 8.5, so I cannot run these APIs to get answers. This is a problem because I am trying to plan the cluster's long-term memory requirements ahead of time. Planning by the number of expected indices, fields etc. is possible, but since that guidance has now been dropped from the documentation, it is hard to rely on what the API returns if you don't already have 8.5 set up (sorry if I am missing something).

Some questions:

If I use the same mapping on a test 8.5 cluster, would the total_deduplicated_mapping_size be identical after I upgrade production to 8.5? i.e. is this determined entirely by the mapping?
The node stats are more problematic because they are per node. How can I get the expected total_estimated_overhead before I upgrade to 8.5? Does this number relate in any way to the number of indices? I reckon that the more indices there are, the larger this overhead becomes. Is this correct?

Thanks.

DavidTurner · November 2, 2022, 2:52pm

If you have enough memory in 7.x then you will be fine in 8.x. The guidance has been updated in 8.x versions because of some significant reductions in heap usage in recent versions. I would suggest doing the upgrade first without changing the size of your cluster, and once the upgrade is complete you can start to measure things and think about reducing your cluster size.

ewolfman · November 2, 2022, 2:58pm

Nevertheless, is the 8.3/8.4 documentation regarding the number of indices and memory calculations still relevant and correct in 8.5?

DavidTurner · November 2, 2022, 3:02pm

The 8.3/8.4 docs do not take account of mapping deduplication, so the memory usage in that area in 8.5 should be no worse (and will often be better).

Likewise, 8.5 still allows 1kiB per mapped field (see these docs) but this may well improve in future versions.

ewolfman · November 2, 2022, 4:10pm

Thanks!
