I am new to ELK and tried load testing for the first time. I ran a heavy load test and tried to create a report for the last three months. When I ran the command, shards started failing, and all the Elasticsearch nodes and Logstash services moved to a stopped state. In the Logstash logs, I got the error below:
[2020-07-24T12:31:50,219][INFO ][logstash.outputs.elasticsearch][nir-esim-gdsp_pipeline][c042dc0baedb208c3ba6bede824f0d8ed0fa8c3a85c5914726e4dfce3f7315bb] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}
[2020-07-24T12:31:56,732][INFO ][logstash.outputs.elasticsearch][first_pipeline][c042dc0baedb208c3ba6bede824f0d8ed0fa8c3a85c5914726e4dfce3f7315bb] retrying failed action with response code: 429 ({"type"=>"circuit_breaking_exception", "reason"=>"[parent] Data too large, data for [<transport_request>] would be [1050524440/1001.8mb], which is larger than the limit of [1020054732/972.7mb], real usage: [1050521096/1001.8mb], new bytes reserved: [3344/3.2kb], usages [request=24208/23.6kb, fielddata=38134/37.2kb, in_flight_requests=67096/65.5kb, accounting=10372352/9.8mb]", "bytes_wanted"=>1050524440, "bytes_limit"=>1020054732, "durability"=>"PERMANENT"})
Is there a setting so that, instead of Elasticsearch going down, it just rejects the request or returns a timeout error in Kibana? If the ELK cluster goes down, all the logs will pile up on the source system.
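For anyone reading the numbers in that log line, here is a rough sketch of the arithmetic in Python. It only uses values copied from the log above. The 95% figure is an assumption: it is the default parent breaker limit when real-memory accounting is enabled, and the cluster here may have been configured differently.

```python
# Values copied from the circuit_breaking_exception in the log above.
BYTES_WANTED = 1050524440   # "bytes_wanted"
BYTES_LIMIT = 1020054732    # "bytes_limit"
TRACKED_USAGES = {
    "request": 24208,
    "fielddata": 38134,
    "in_flight_requests": 67096,
    "accounting": 10372352,
}

# ASSUMPTION: the parent breaker limit is the default 95% of the JVM heap.
ASSUMED_PARENT_LIMIT_FRACTION = 0.95


def to_mib(n_bytes: float) -> float:
    """Convert bytes to mebibytes, the unit the log prints as 'mb'."""
    return n_bytes / (1024 * 1024)


# Heap size implied by the limit, under the 95% assumption.
implied_heap = BYTES_LIMIT / ASSUMED_PARENT_LIMIT_FRACTION

# How far the request would have pushed usage past the limit.
overshoot = BYTES_WANTED - BYTES_LIMIT

# Memory attributed to the individual child breakers.
tracked_total = sum(TRACKED_USAGES.values())

print(f"implied heap:   {to_mib(implied_heap):.1f} MiB")
print(f"overshoot:      {to_mib(overshoot):.1f} MiB")
print(f"tracked usages: {to_mib(tracked_total):.1f} MiB")
```

Under that assumption the limit corresponds to a heap of about 1 GiB, and the child breakers account for only a small part of the reported real usage, so most of the heap was occupied by something the individual breakers do not track.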
Hey @rohitarorait82! Sorry about the delayed response. I just talked with our Elasticsearch team. Ordering terms aggs by cardinality is just a really expensive query, so you're prone to run into issues like this.
Some other things I learned:
The circuit breaker exception should be non-fatal, so there might be something else going on if your nodes are really going down (though it can be fatal sometimes).
Also, the circuit breaker trips whenever overall memory usage goes over a certain threshold, so something else might be chewing through your available memory, not just this query (though this is an expensive query).
A composite agg should be more efficient if you're trying to see a lot of results, but that might affect your Logstash setup.
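To illustrate the composite agg suggestion, here is a minimal sketch in Python of paging through buckets with `after_key`. The helper names (`composite_body`, `iter_buckets`), the aggregation name `paged`, and the source name `by_field` are made up for the example, and `search` stands for whatever function you use to send a search body to your index and get the parsed JSON response back. Note that composite buckets come back in key order, so ordering by cardinality would have to happen on the client side.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def composite_body(
    field: str,
    page_size: int = 1000,
    after: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a search body with one composite aggregation over `field`."""
    composite: Dict[str, Any] = {
        "size": page_size,  # buckets per page, not documents
        "sources": [{"by_field": {"terms": {"field": field}}}],
    }
    if after is not None:
        # Resume from the last key of the previous page.
        composite["after"] = after
    # "size": 0 skips the hits; only the aggregation result is needed.
    return {"size": 0, "aggs": {"paged": {"composite": composite}}}


def iter_buckets(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    field: str,
    page_size: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield every bucket, fetching one bounded page at a time."""
    after: Optional[Dict[str, Any]] = None
    while True:
        response = search(composite_body(field, page_size, after))
        agg = response["aggregations"]["paged"]
        buckets = agg["buckets"]
        if not buckets:
            return
        yield from buckets
        after = agg.get("after_key")
        if after is None:
            return
```

Each request only asks for `page_size` buckets, so the memory needed per request stays bounded instead of growing with the number of distinct terms over three months of data.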