indices.fielddata.cache.size and indices.breaker.fielddata.limit

Hi all,

I spent some time reading https://www.elastic.co/guide/en/elasticsearch/guide/current/_limiting_memory_usage.html, but I still find the concepts confusing. I have come up with some interpretations/statements, and I hope you can confirm whether they are correct.

First of all, by default indices.fielddata.cache.size is unbounded (not set), so there is no fielddata eviction. This means that a query that loads more values into fielddata than the heap can hold will cause an OOM exception and lead to node death.
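(As a side note, to see how much fielddata is actually loaded on each node, I assume the cat fielddata API can be used, something along these lines:

```
GET /_cat/fielddata?v
```

which, if I understand correctly, shows the fielddata memory used per node and per field.)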

So to prevent an OOM that kills an ES node, we can set indices.fielddata.cache.size to a value such as 70%. "With this setting in place, the least recently used fielddata will be evicted to make space for newly loaded data." In other words, once fielddata is full, it evicts old values to make room for the new ones requested by a query.
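If I read the docs correctly, this is a static setting that goes into config/elasticsearch.yml (so I assume it needs a node restart to take effect), e.g.:

```
indices.fielddata.cache.size: 70%
```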

Yet there is still a problem with using only indices.fielddata.cache.size, because the fielddata size is checked after the data is loaded. In other words, if a new query loads values into fielddata larger than 70% of the heap, say 100%, an OOM still occurs even if fielddata evicts all of its old values.

To solve this problem, ES uses indices.breaker.fielddata.limit. It is recommended to set indices.breaker.fielddata.limit higher than indices.fielddata.cache.size, so that the circuit breaker serves as a safety net to prevent OOM.

If we set indices.breaker.fielddata.limit to 80%, the circuit breaker estimates the size of the values before loading them, so if they would push fielddata past 80% of the heap (above the 70% that indices.fielddata.cache.size allows), the query is aborted with an exception instead of causing an OOM.
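I believe this limit can be updated dynamically with the cluster settings API (unlike the cache size), for example:

```
PUT /_cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit": "80%"
  }
}
```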

In summary:

  • We should set indices.fielddata.cache.size to a value lower than indices.breaker.fielddata.limit, so that queries accessing new values can still run when fielddata is full.
  • indices.breaker.fielddata.limit acts as a safety net to prevent OOM.

Are the interpretations above correct?

Thanks,
Anh