[2019-10-07T07:40:55,341][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"circuit_breaking_exception", "reason"=>"[parent] Data too large, data for [<transport_request>] would be [16414928216/15.2gb], which is larger than the limit of [16320875724/15.1gb], real usage: [16414925088/15.2gb], new bytes reserved: [3128/3kb]", "bytes_wanted"=>16414928216, "bytes_limit"=>16320875724, "durability"=>"TRANSIENT"})
I know there are already questions about this, but I still don't understand it clearly from the other posts.
Shards / hot-data node: ~300 (Max is 400+ during ILM)
Shards / warm-data node: ~600 - 700 (Max is 800+ during ILM) (Indices are read-only)
Shards / cold-data node: ~1100 - 1200 (Max is 1300+ during ILM) (Indices are read-only and frozen)
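For reference, per-node shard counts like the ones above can be read from the "shards" column of the allocation cat API:

GET _cat/allocation?v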
From the error above, the suggestion in the other posts is to increase the heap size.
However, I'm not sure on which nodes I should increase the heap: master, hot-data (ingest)?
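As far as I can tell, the per-node breaker usage can be checked with a generic node stats call (nothing specific to my setup):

GET _nodes/stats/breaker

That shows the parent breaker limit and estimated usage on each node, but I still don't know which tier to give more heap.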
@worapojc Very recently I had circuit breaker exceptions that were caused by the heap setting in jvm.options on our ingest nodes. Those only had 4GB of RAM. After increasing to 12GB, I haven't seen any circuit breaker errors since.
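In case it's useful, the heap size itself lives in jvm.options; the 12g below is just the value from my case, size it to your own nodes (and keep it at no more than roughly half of the machine's RAM):

-Xms12g
-Xmx12g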
@worapojc Could it be that you are sending those requests to the master node(s) given that the circuit breaker exception contains a reference to ~16GB of heap?
I think removing the hosts of the master nodes from your logstash configuration and sending requests to your hot nodes might help here.
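Something along these lines in the elasticsearch output block (the hot-data hostnames below are just placeholders for your own hot nodes):

output {
  elasticsearch {
    hosts => ["http://hot-data-1:9200", "http://hot-data-2:9200"]   # hot nodes only, no masters
    # ... keep the rest of your existing output settings ...
  }
}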
In case it doesn't, feel free to share your jvm.options so I can take a look and see if something can be optimized there.
@worapojc Elasticsearch never returns a 504 from its APIs. The issue must be coming from something (HTTP proxy of some sort or Kibana) between the client and ES. You shouldn't see those errors when directly calling the ES REST API.
@willemdh This is because ES keeps some data structures in your heap permanently, and their size is closely related to the amount of data you have indexed. We have experimented a lot with this, and the only solution we found is, first, to follow the ES suggestions for tuning indexed data and, second, to design the system to scale horizontally so that each data node holds a smaller amount of data, which in turn consumes less heap.
Heap size can be different for each use case. Increasing RAM beyond a certain amount is not a solution in itself; that said, the more RAM you leave for the system to play with, the faster your query responses will be.
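To see how the data on each node translates into heap pressure, one quick check is a generic cat query like this (adjust the columns as needed):

GET _cat/nodes?v&h=name,node.role,heap.percent,ram.percent,segments.memory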
It's hard to tell; there are multiple possible causes. I would look into whether one or more of your nodes are abnormally slow for some reason (e.g. they could be swapping, which would typically show up as very long GC times and warnings about them in their logs).
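If swapping does turn out to be the problem, one common mitigation (assuming you haven't already disabled swap at the OS level) is to lock the heap in memory via elasticsearch.yml:

bootstrap.memory_lock: true

and then verify it took effect with:

GET _nodes?filter_path=**.mlockall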