Circuit breaker in Elasticsearch

pksinghal · December 17, 2023, 11:40am

We are running an Elasticsearch cluster with 3 nodes.
Sometimes a heavy aggregation query comes in (run manually from Kibana Dev Tools) and one of the nodes becomes inaccessible.
The whole cluster then becomes inaccessible, because ES takes some time to remove that node.
So we restart that node to get the cluster running again.

We have now configured the circuit breakers in Elasticsearch with:
indices.breaker.total.limit = 10GB
network.breaker.inflight_requests.limit = 10GB

Now, if I run the same heavy agg query, Elasticsearch again becomes unresponsive with a circuit breaker exception, and other indexing/search requests also start to throw exceptions.
Exception: [parent] Data too large, data for [<transport_request>] would be [10856018759/10.1gb], which is larger than the limit of [10737418240/10gb], usages [request=0/0b, fielddata=6552153321/6.1gb, in_flight_requests=723/723b.
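
To make the numbers in that message easier to read, here is a minimal Python sketch that pulls out the byte counts. The message string is copied from the post as-is (it is cut off in the original); the helper name `parse_breaker_message` is only illustrative. It shows how far over the limit the request was, how much of the limit fielddata alone takes, and how many bytes are not explained by the usages that are visible in the quoted text.

```python
import re

# Breaker message copied from the post above (truncated in the original).
MESSAGE = (
    "[parent] Data too large, data for [<transport_request>] would be "
    "[10856018759/10.1gb], which is larger than the limit of "
    "[10737418240/10gb], usages [request=0/0b, "
    "fielddata=6552153321/6.1gb, in_flight_requests=723/723b."
)


def parse_breaker_message(message):
    """Return (would_be_bytes, limit_bytes, usages) parsed from the message."""
    would_be = int(re.search(r"would be \[(\d+)/", message).group(1))
    limit = int(re.search(r"limit of \[(\d+)/", message).group(1))
    # Each usage entry looks like name=<bytes>/<human readable size>.
    usages = {name: int(size) for name, size in re.findall(r"(\w+)=(\d+)/", message)}
    return would_be, limit, usages


would_be, limit, usages = parse_breaker_message(MESSAGE)
listed = sum(usages.values())

print(f"over the limit by: {would_be - limit} bytes")
print(f"fielddata share of the limit: {usages['fielddata'] / limit:.0%}")
print(f"not covered by the listed usages: {would_be - listed} bytes")
```

Fielddata alone accounts for most of the 10gb parent limit here, and roughly 4gb of the total is not attributed to any usage shown in the truncated message.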

Questions:
Is there a way to control memory usage for a single request?
Should we lower the fielddata/query cache limit?

configuration:
3 TB Disk
16 Core CPU
64 GB RAM
30 GB Heap to Elasticsearch
Avg heap utilization is ~27GB.

BenB196 · December 17, 2023, 12:43pm

Which version of Elasticsearch are you using?

pksinghal · December 17, 2023, 12:48pm

6.4 version

BenB196 · December 17, 2023, 3:46pm

I'd recommend upgrading; 6.4 is far past EOL and there have been significant improvements since then.

That being said, your average heap usage is pretty high: 27GB/30GB = ~90%. It is generally recommended that average heap usage stay below 70%.

Without knowing much about your cluster (and also not knowing much about the 6.x release anymore), you might want to look at adding some additional nodes to spread load and increase total cluster heap. (But you should really look at upgrading.)
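
As a rough illustration of the 70% guideline, the sketch below checks per-node heap usage from a dictionary shaped like the `GET _nodes/stats/jvm` response. The node IDs, node names, and byte counts are made-up sample values chosen to mirror the 27GB/30GB figure, not data from this cluster, and `nodes_over_threshold` is a hypothetical helper.

```python
# Shape follows the `GET _nodes/stats/jvm` response. The node IDs, names and
# byte counts are made-up sample values, not data from the cluster in question.
SAMPLE_NODES_STATS = {
    "nodes": {
        "node-a": {
            "name": "es-1",
            "jvm": {"mem": {"heap_used_in_bytes": 27 * 1024**3,
                            "heap_max_in_bytes": 30 * 1024**3}},
        },
        "node-b": {
            "name": "es-2",
            "jvm": {"mem": {"heap_used_in_bytes": 18 * 1024**3,
                            "heap_max_in_bytes": 30 * 1024**3}},
        },
    }
}


def nodes_over_threshold(stats, threshold=0.70):
    """Return {node name: heap ratio} for nodes whose heap usage exceeds threshold."""
    over = {}
    for node in stats["nodes"].values():
        mem = node["jvm"]["mem"]
        ratio = mem["heap_used_in_bytes"] / mem["heap_max_in_bytes"]
        if ratio > threshold:
            over[node["name"]] = ratio
    return over


for name, ratio in nodes_over_threshold(SAMPLE_NODES_STATS).items():
    print(f"{name}: heap {ratio:.0%} used")
```

A single reading can be misleading because heap usage rises and falls with garbage collection, so it makes more sense to look at several samples over time.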

Christian_Dahlqvist · December 17, 2023, 3:54pm

What is the full output of the cluster stats API?

What is the use case? Are you using time-based indices?

pksinghal · December 17, 2023, 4:23pm

To decrease heap usage, should I reduce the fielddata cache limit?

pksinghal · December 17, 2023, 4:29pm

I have pageviews data; the indices are time-based (monthly). We have a visitorId that is unique to each user*browser, and I have ~30 million docs per month.

I have implemented the circuit breaker.

Then I execute the following query:

GET pageviews_data_m*/_search
{
  "size": 0,
  "timeout": "1s",
  "terminate_after": 100000,
  "query": { "match_all": { "boost": 1.0 } },
  "aggregations": {
    "suggestions": {
      "terms": {
        "field": "visitorId",
        "size": 10,
        "shard_size": 10,
        "min_doc_count": 1,
        "shard_min_doc_count": 0,
        "show_term_doc_count_error": false,
        "execution_hint": "map",
        "order": [ { "_count": "desc" }, { "_key": "asc" } ]
      }
    }
  }
}

After running this query, a circuit breaker exception occurs, and all other requests (insert/search) get the same exception.

Is there a way to implement a circuit breaker at the request level?

Christian_Dahlqvist · December 17, 2023, 4:45pm

Are you updating older data, or are the monthly indices effectively read-only once the new index is created? If they are read-only, you may be able to lower heap usage by forcemerging the old indices down to a single segment.

pksinghal · December 17, 2023, 4:56pm

I am not updating data in old indices.

Also, is there a way to implement a circuit breaker at the request level?

Christian_Dahlqvist · December 17, 2023, 5:01pm

I do not think there is a lot you can do in 6.4, but I have not used it in years. A lot has been improved with respect to circuit breakers and heap usage in newer versions though, so I would recommend you upgrade.

Making sure you do not have a lot of very small shards and forcemerging old indices down to a single segment can help reduce the amount of heap used. Making sure your mappings are optimised is also a good step to take.
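
For the forcemerge suggestion, a minimal Python sketch of the request could look like the following. It only builds the `POST <index>/_forcemerge?max_num_segments=1` call and does not send it. The base URL `http://localhost:9200` and the index name `pageviews_data_m_2023_10` are assumptions for illustration (the thread only shows the pattern `pageviews_data_m*`), and `build_forcemerge_request` is a hypothetical helper.

```python
import urllib.request


def build_forcemerge_request(base_url, index, max_num_segments=1):
    """Build (but do not send) a force merge request for one index."""
    url = (
        f"{base_url.rstrip('/')}/{index}"
        f"/_forcemerge?max_num_segments={max_num_segments}"
    )
    return urllib.request.Request(url, method="POST")


# Hypothetical host and index name; substitute a real old, read-only index.
req = build_forcemerge_request("http://localhost:9200", "pageviews_data_m_2023_10")
print(req.get_method(), req.full_url)

# To actually run the merge against a cluster:
# urllib.request.urlopen(req)
```

Force merging is I/O heavy, so it is probably best run one index at a time during a quiet period, and only on indices that no longer receive writes.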

