Circuit breaker exception resulted in termination of all Elasticsearch nodes

rohitarorait82 · July 27, 2020, 4:54pm

Hi All,

I am new to ELK and ran a load test for the first time. While the cluster was under heavy load, I tried to create a report covering the last three months. When I ran the command, shards started failing, and all the Elasticsearch nodes and Logstash services moved to a stopped state. The Logstash logs show the error below:

[2020-07-24T12:31:50,219][INFO ][logstash.outputs.elasticsearch][nir-esim-gdsp_pipeline][c042dc0baedb208c3ba6bede824f0d8ed0fa8c3a85c5914726e4dfce3f7315bb] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}
[2020-07-24T12:31:56,732][INFO ][logstash.outputs.elasticsearch][first_pipeline][c042dc0baedb208c3ba6bede824f0d8ed0fa8c3a85c5914726e4dfce3f7315bb] retrying failed action with response code: 429 ({"type"=>"circuit_breaking_exception", "reason"=>"[parent] Data too large, data for [<transport_request>] would be [1050524440/1001.8mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1050521096/1001.8mb], new bytes reserved: [3344/3.2kb], usages [request=24208/23.6kb, fielddata=38134/37.2kb, in_flight_requests=67096/65.5kb, accounting=10372352/9.8mb]", "bytes_wanted"=>1050524440, "bytes_limit"=>1020054732, "durability"=>"PERMANENT"})

Is there a setting that makes Elasticsearch reject the request or return a timeout error in Kibana instead of going down? If the ELK cluster goes down, all the logs pile up in the source system.
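Editor's note: the numbers in the exception can be checked with a little arithmetic. The sketch below uses only the values from the log line above. It assumes the parent breaker is at its default limit of 95% of the JVM heap (the default when real-memory accounting is enabled); under that assumption, the reported limit would be consistent with a 1 GB heap, which is an inference and not something the thread confirms.

```python
# Values copied from the circuit_breaking_exception in the Logstash log above.
bytes_wanted = 1050524440
bytes_limit = 1020054732
real_usage = 1050521096
new_bytes_reserved = 3344
child_usages = {
    "request": 24208,
    "fielddata": 38134,
    "in_flight_requests": 67096,
    "accounting": 10372352,
}

MIB = 1024 ** 2
GIB = 1024 ** 3

# The breaker compares (real heap usage + bytes about to be reserved) to the limit.
assert real_usage + new_bytes_reserved == bytes_wanted

# ASSUMPTION: default parent limit of 95% of heap. With a 1 GiB heap this
# reproduces the limit reported in the log.
limit_for_1gib_heap = GIB * 95 // 100

# Memory tracked by the child breakers versus the real heap usage.
tracked = sum(child_usages.values())
untracked = real_usage - tracked
overshoot = bytes_wanted - bytes_limit

print(f"limit for a 1 GiB heap at 95%: {limit_for_1gib_heap} bytes")
print(f"tracked by child breakers:     {tracked / MIB:.1f} MiB")
print(f"heap usage not tracked:        {untracked / MIB:.1f} MiB")
print(f"over the limit by:             {overshoot / MIB:.1f} MiB")
```

The child breakers account for only about 10 MiB of the roughly 1000 MiB in use, so most of the heap was occupied by something other than the request that happened to trip the breaker.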

myasonik · July 27, 2020, 7:05pm

Hey @rohitarorait82!

Unfortunately, there's no way to swallow these errors and keep ES up and running while it's overloaded.

I'd recommend reading this blog post about the issue and trying to tune your ES cluster to be a better fit for the data you have.

rohitarorait82 · July 30, 2020, 7:10am

Thanks @myasonik for your reply.

I am just trying to run the query below in ELK against a very large amount of data. Is there a way to find out the maximum time range I can use in this query?

GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "2": {
      "terms": {
        "field": "API.keyword",
        "order": {
          "1": "desc"
        },
        "size": 500
      },
      "aggs": {
        "1": {
          "cardinality": {
            "field": "correl.keyword"
          }
        },
        "3": {
          "terms": {
            "field": "Consumer.keyword",
            "order": {
              "1": "desc"
            },
            "size": 50
          },
          "aggs": {
            "1": {
              "cardinality": {
                "field": "correl.keyword"
              }
            }
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": "2020-07-30T05:49:39.444Z",
              "lte": "2020-07-30T05:50:39.444Z",
              "format": "strict_date_optional_time"
            }
          }
        }
      ]
    }
  }
}

myasonik · August 4, 2020, 6:46pm

Hey @rohitarorait82! Sorry about the delayed response. I just talked with our Elasticsearch team. Ordering terms aggs by cardinality is a really expensive query, so you're prone to run into issues like this.

Some other things I learned:

The circuit breaker exception should be non-fatal, so if your nodes are really going down, something else might be going on (though it can be fatal sometimes).
The circuit breaker also trips whenever overall memory use is over a certain threshold, so something else might be chewing through your available memory, not just this query (though this is an expensive query).
A composite agg should be more efficient if you're trying to see a lot of results, but that might affect your Logstash setup.
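
Editor's note: as a rough illustration of the composite agg suggestion, the sketch below builds a request body that pages through (API, Consumer) pairs and counts unique correl values per pair. The field names and time range come from the query earlier in the thread. The aggregation names (`by_api_consumer`, `unique_correl`), the helper functions, and the `search` callable are hypothetical and stand in for whatever client you use. Note that a composite agg returns buckets in key order and cannot be ordered by a sub-aggregation, so any "top N by cardinality" sorting has to happen on the client.

```python
def build_composite_body(gte, lte, page_size=500, after_key=None):
    """Build a search body that pages over (API, Consumer) pairs."""
    composite = {
        "size": page_size,
        "sources": [
            {"api": {"terms": {"field": "API.keyword"}}},
            {"consumer": {"terms": {"field": "Consumer.keyword"}}},
        ],
    }
    if after_key is not None:
        # Resume after the last bucket of the previous page.
        composite["after"] = after_key
    return {
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    {"range": {"@timestamp": {"gte": gte, "lte": lte}}}
                ]
            }
        },
        "aggs": {
            "by_api_consumer": {
                "composite": composite,
                "aggs": {
                    "unique_correl": {
                        "cardinality": {"field": "correl.keyword"}
                    }
                },
            }
        },
    }


def iter_buckets(search, gte, lte, page_size=500):
    """Yield every bucket, one page at a time.

    `search` is a hypothetical callable that takes a request body (dict)
    and returns the parsed JSON response (dict).
    """
    after_key = None
    while True:
        response = search(build_composite_body(gte, lte, page_size, after_key))
        agg = response["aggregations"]["by_api_consumer"]
        yield from agg["buckets"]
        after_key = agg.get("after_key")
        if after_key is None or not agg["buckets"]:
            break


def top_by_cardinality(buckets, n=50):
    """Sort client-side, since composite buckets come back in key order."""
    ranked = sorted(buckets, key=lambda b: b["unique_correl"]["value"], reverse=True)
    return ranked[:n]
```

Each page holds only `page_size` buckets in memory on the Elasticsearch side, which is the reason this shape tends to be gentler on the heap than one large nested terms agg.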

rohitarorait82 · August 6, 2020, 6:07am

@myasonik Thanks a lot. I will check and try to implement all these suggestions.

system · September 3, 2020, 6:15am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
