Inserting data into Elasticsearch - Retrying failed action with response code: 429 - circuit_breaking_exception - Data too large

I am trying to insert data into Elasticsearch using both Logstash and the Bulk API from the Python helpers. Do you have any idea why I get the error shown below? I tried changing indices.breaker.request.limit to 80%, and I have set LS_JAVA_OPTS="-Xmx30g -Xms30g" for both Elasticsearch and Logstash, but it doesn't help. How can I solve this problem?

[2019-06-03T12:41:52,404][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"circuit_breaking_exception", "reason"=>"[parent] Data too large, data for [<transport_request>] would be [1031960606/984.1mb], which is larger than the limit of [986061209/940.3mb], real usage: [1031042392/983.2mb], new bytes reserved: [918214/896.6kb]", "bytes_wanted"=>1031960606, "bytes_limit"=>986061209, "durability"=>"PERMANENT"})
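The numbers in this message can be checked with a little arithmetic. The following is only a sketch: it assumes the parent breaker is at its default of 95% of the JVM heap (the default in Elasticsearch 7.x when real-memory accounting is on; the post changes indices.breaker.request.limit, not the parent limit). The function names are illustrative and not part of any library.

```python
import re

# The "reason" string from the log line above, copied verbatim.
REASON = (
    "[parent] Data too large, data for [<transport_request>] would be "
    "[1031960606/984.1mb], which is larger than the limit of "
    "[986061209/940.3mb], real usage: [1031042392/983.2mb], "
    "new bytes reserved: [918214/896.6kb]"
)


def parse_breaker_reason(reason):
    """Extract the raw byte counts from a circuit_breaking_exception reason."""
    pattern = (
        r"would be \[(\d+)/[^\]]+\].*?limit of \[(\d+)/[^\]]+\].*?"
        r"real usage: \[(\d+)/[^\]]+\].*?new bytes reserved: \[(\d+)/[^\]]+\]"
    )
    match = re.search(pattern, reason)
    if match is None:
        raise ValueError("unrecognized breaker message")
    wanted, limit, real_usage, reserved = (int(g) for g in match.groups())
    return {
        "wanted": wanted,          # heap usage if the request were accepted
        "limit": limit,            # parent breaker limit in bytes
        "real_usage": real_usage,  # heap already in use
        "reserved": reserved,      # bytes this request would add
    }


def implied_heap_bytes(limit_bytes, limit_fraction=0.95):
    """Heap size implied by the limit. ASSUMPTION: limit is 95% of the heap."""
    return limit_bytes / limit_fraction


info = parse_breaker_reason(REASON)
print("request size (bytes):", info["reserved"])
print("implied heap (MiB):", round(implied_heap_bytes(info["limit"]) / 1024**2, 1))
```

The request itself is under 1 MB; the heap was already almost at the limit before it arrived. If the 95% assumption holds, the limit corresponds to a heap of roughly 1 GB rather than 30 GB, which would suggest the heap setting never reached Elasticsearch. LS_JAVA_OPTS is read by Logstash; Elasticsearch takes its heap settings from ES_JAVA_OPTS or jvm.options.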

What is the size of your documents? What bulk size are you using? How many indices and shards are you actively indexing into? How many concurrent indexing processes/threads do you have?

Each metric is approximately 250 bytes. With both Logstash and the Python client I used a batch size of 1000 and ran 40 concurrent processes. At the moment I have 4 Elasticsearch nodes, each configured with master: true, data: true, ingest: true. When I run the insert processes on the first server, everything works fine. However, when I run 40 insert processes on the second server, I get the error shown in the question. Do you need any additional information?
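For reference, the setup described above (250-byte metrics, batches of 1000, 40 processes) could look roughly like this with the Python helpers. This is a minimal sketch, not the poster's actual client: metric_actions, index_metrics, in_flight_bytes and the index name "logs" are assumptions for illustration, and the mapping type from /logs/log is left out. It relies on the max_retries, initial_backoff and max_backoff parameters of elasticsearch.helpers.streaming_bulk, which make the helper back off and resend documents rejected with HTTP 429.

```python
def in_flight_bytes(doc_bytes, batch_size, workers):
    """Rough payload size if every worker has one bulk request in flight."""
    return doc_bytes * batch_size * workers


def metric_actions(metrics, index="logs"):
    """Wrap each metric dict in the action format the bulk helpers expect."""
    for metric in metrics:
        yield {"_index": index, "_source": metric}


def index_metrics(client, metrics, batch_size=1000):
    """Send metrics in bulks, retrying documents rejected with HTTP 429.

    `client` is an elasticsearch.Elasticsearch instance created by the caller.
    Returns the number of documents that were indexed successfully.
    """
    # Imported lazily so the helper functions above work without the package.
    from elasticsearch.helpers import streaming_bulk

    ok_count = 0
    for ok, _item in streaming_bulk(
        client,
        metric_actions(metrics),
        chunk_size=batch_size,   # documents per bulk request
        max_retries=5,           # retry documents rejected with 429
        initial_backoff=2,       # seconds before the first retry
        max_backoff=60,          # upper bound on the wait between retries
        raise_on_error=False,    # report failures instead of raising
    ):
        ok_count += int(ok)
    return ok_count


# 250 bytes * 1000 documents * 40 processes
print(in_flight_bytes(250, 1000, 40))
```

By this estimate the raw payload in flight is only about 10 MB, so the batch size alone is unlikely to explain a breaker tripping near 1 GB; the retry settings only smooth over rejections and do not remove their cause.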

How many indices and shards are you actively indexing into? If you gradually increase the level of concurrency, how far do you get before you start seeing errors?
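One way to run the ramp-up test suggested here is a small harness that raises the number of workers step by step and reports the first level at which a request is rejected. This is a sketch only: send_batch is a hypothetical callable supplied by the caller (for example a wrapper around one bulk request that returns False on a 429), and threads stand in for the separate processes used in the thread.

```python
from concurrent.futures import ThreadPoolExecutor


def first_failing_concurrency(send_batch, levels, batches_per_worker=10):
    """Return the first concurrency level at which any batch fails, else None.

    send_batch is a hypothetical callable that sends one bulk request and
    returns True on success or False on a rejection such as HTTP 429.
    """
    for workers in levels:
        total = workers * batches_per_worker
        # Run `total` batches with at most `workers` of them in flight.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(lambda _: send_batch(), range(total)))
        if not all(results):
            return workers
    return None
```

Calling it with levels such as [1, 2, 5, 10, 20, 40] gives a rough idea of where the cluster starts pushing back, which is easier to reason about than starting directly at 40 processes.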

I add all the data to /logs/log. In Kibana monitoring, the Elasticsearch overview shows 56 total shards. Is this what you mean?

Are you indexing into all of these?

I would recommend reading this blog post, which discusses why rejections like the ones you are seeing occur.

How much data does each node hold? What is the heap size?

I no longer receive this message; however, I still have a problem with indexing performance. As I mentioned, I have a cluster of four nodes, each set to master: true, data: true, ingest: true. To insert data I use the Bulk API from the helpers collection in Python. I run 40 parallel processes, each sending batches of 1000 metrics. If I run these processes only on the first node and insert the data into the first node, I get about 120k entries per second. When I run the same on the second node and insert data into the second node, the throughput of the whole cluster drops to around 100 instead of reaching 240k. Any ideas how to solve this?
