How to Resolve in_flight_requests Data Too Large Errors

Hello, dear community,

We are constantly getting this OutOfMemory error on our master-1 machine:

[in_flight_requests] New used memory 4567112664 [4.2gb] for data of [<http_request>] would be larger than configured breaker: 4294967296 [4gb]

We tried increasing the heap size to 5 GB, among other things, but the error kept occurring.

I used this API to get more details about my circuit breakers:

GET _nodes/stats/breaker

The output (summary):

{
    "_nodes": {
        "total": 9,
        "successful": 9,
        "failed": 0
    },
    "name": "master-3",
    "breakers": {
        "in_flight_requests": {
            "limit_size_in_bytes": 4294967296,
            "limit_size": "4gb",
            "estimated_size_in_bytes": 2754,
            "estimated_size": "2.6kb",
            "overhead": 2.0,
            "tripped": 0
        }
    },
    "name": "master-2",
    "breakers": {
        "in_flight_requests": {
            "limit_size_in_bytes": 4294967296,
            "limit_size": "4gb",
            "estimated_size_in_bytes": 2754,
            "estimated_size": "2.6kb",
            "overhead": 2.0,
            "tripped": 0
        }
    },
    "name": "master-1",
    "breakers": {
        "in_flight_requests": {
            "limit_size_in_bytes": 4294967296,
            "limit_size": "4gb",
            "estimated_size_in_bytes": 135597,
            "estimated_size": "132.4kb",
            "overhead": 2.0,
            "tripped": 872
        }
    }
}

As you can see, we have 3 master nodes. master-2 and master-3 have tripped 0 times, but master-1 has tripped 872 times.
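For anyone who wants to run the same comparison across all nodes, here is a minimal sketch in Python. It assumes the full response of GET _nodes/stats/breaker has already been parsed into a dict, with each node listed under a top-level "nodes" key by node ID. The function name and the node IDs in the sample are hypothetical; only the names and counters come from the summary above.

```python
def tripped_breakers(stats, breaker="in_flight_requests"):
    """Return {node_name: tripped_count} for nodes where the breaker has tripped."""
    result = {}
    for node in stats.get("nodes", {}).values():
        info = node.get("breakers", {}).get(breaker)
        if info and info.get("tripped", 0) > 0:
            result[node["name"]] = info["tripped"]
    return result


# Sample shaped like a parsed _nodes/stats/breaker response.
# The node IDs ("id-1", ...) are placeholders, not real IDs.
sample = {
    "nodes": {
        "id-1": {"name": "master-1",
                 "breakers": {"in_flight_requests": {"tripped": 872}}},
        "id-2": {"name": "master-2",
                 "breakers": {"in_flight_requests": {"tripped": 0}}},
        "id-3": {"name": "master-3",
                 "breakers": {"in_flight_requests": {"tripped": 0}}},
    }
}

print(tripped_breakers(sample))  # {'master-1': 872}
```

A node that stands out like this is usually the one receiving the client traffic.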

I found the network.breaker.inflight_requests.limit setting in the circuit breaker documentation. It says the default is 100% (of the JVM heap). I didn't see that setting in my elasticsearch.yml file or anywhere else, so I assume we are running with the default?

Here are my questions to you guys:

Our in-flight requests circuit breaker has tripped 872 times on master-1, but it hasn't tripped at all on master-2 and master-3. It looks like we have a problem somewhere, but I don't know where to look.

  1. What should I do about this situation? Would putting a load balancer in front of the masters help?
  2. Can the network.breaker.inflight_requests.limit setting fix my OOM problems with in-flight requests? If so, how can I apply it in my cluster?
  3. How should I approach these problems in general?

Thanks in advance for reading and replying!

If you have dedicated master nodes in the cluster, why are you sending requests to them? Dedicated master nodes should not serve traffic; they should be left to manage the cluster. Send the data directly to the data nodes instead.
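One way to act on this is to build the client's host list from the node roles rather than hard-coding master addresses. The following is a rough sketch in Python, assuming a parsed GET _nodes/http response in which each node carries a "roles" list and an "http.publish_address" of the form host:port. The function name, node IDs, and addresses are made up for illustration, and treating "master only" (optionally with voting_only) as the definition of a dedicated master is an assumption of this sketch.

```python
DEDICATED_MASTER_ROLES = {"master", "voting_only"}


def client_hosts(nodes_info):
    """Return HTTP addresses of every node that is not a dedicated master."""
    hosts = []
    for node in nodes_info.get("nodes", {}).values():
        roles = set(node.get("roles", []))
        # Assumption: a node whose only roles are master (and voting_only)
        # is a dedicated master and should not receive client traffic.
        if roles and roles <= DEDICATED_MASTER_ROLES:
            continue
        address = node.get("http", {}).get("publish_address")
        if address:
            hosts.append("http://" + address)
    return sorted(hosts)


# Hypothetical sample: IDs and addresses are placeholders.
sample_nodes = {
    "nodes": {
        "id-1": {"name": "master-1", "roles": ["master"],
                 "http": {"publish_address": "10.0.0.1:9200"}},
        "id-2": {"name": "data-1", "roles": ["data", "ingest"],
                 "http": {"publish_address": "10.0.0.11:9200"}},
        "id-3": {"name": "data-2", "roles": ["data", "ingest"],
                 "http": {"publish_address": "10.0.0.12:9200"}},
    }
}

print(client_hosts(sample_nodes))
# ['http://10.0.0.11:9200', 'http://10.0.0.12:9200']
```

The resulting list can then be given to whatever client or load balancer currently points at master-1.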

Thanks for your answer, Christian! Will do that.
