I am debugging an ES 7.17.3 installation that is persistently running out of memory and tripping the parent circuit breaker. This is an example error:
elasticsearch.exceptions.TransportError: TransportError(429, 'circuit_breaking_exception', '[parent] Data too large, data for [<http_request>] would be [8432214884/7.8gb], which is larger than the limit of [8160437862/7.5gb], real usage: [8432214672/7.8gb], new bytes reserved: [212/212b], usages [request=16440/16kb, fielddata=15261674/14.5mb, in_flight_requests=212/212b, model_inference=0/0b, eql_sequence=0/0b, accounting=61581360/58.7mb]')
I've more than doubled the JVM heap, from 3 GB to 8 GB, but the memory issues are unchanged. The other surprising thing is that these errors occur during periods of heavy indexing load, yet they always seem to be triggered by a search call to ES, never an indexing call.
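For context, the failing requests come through the elasticsearch-py 7.x client. This is only a simplified stand-in for the search pattern that hits the error; the host, index name, and query below are placeholders, not my real ones:

```python
from elasticsearch import Elasticsearch
from elasticsearch.exceptions import TransportError

# Placeholder connection details.
es = Elasticsearch(["http://elasticsearch-master:9200"])

try:
    # Placeholder index and query; the real calls are ordinary match/bool searches.
    resp = es.search(
        index="my-index",
        body={"query": {"match": {"message": "example"}}},
        size=100,
    )
except TransportError as exc:
    # 429 with 'circuit_breaking_exception' is what keeps coming back.
    if exc.status_code == 429:
        print("breaker tripped:", exc.error, exc.info)
    else:
        raise
```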
For debugging, here is the output from the _cat/nodes endpoint:
name id node.role heap.current heap.percent heap.max
elasticsearch-master-0 uLpr cdfhilmrstw 5.6gb 70 8gb
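That snapshot was pulled with the cat nodes API, roughly like this (same placeholder host as above):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://elasticsearch-master:9200"])

# Same columns as the table above (elasticsearch-py 7.x cat API).
print(es.cat.nodes(h="name,id,node.role,heap.current,heap.percent,heap.max", v=True))
```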
And the node breaker stats:
{
  "uLprTtGlRWq-L2mi-mNeFg": {
    "timestamp": 1709934029351,
    "name": "elasticsearch-master-0",
    "transport_address": "10.1.1.148:9300",
    "host": "10.1.1.148",
    "ip": "10.1.1.148:9300",
    "roles": [
      "data",
      "data_cold",
      "data_content",
      "data_frozen",
      "data_hot",
      "data_warm",
      "ingest",
      "master",
      "ml",
      "remote_cluster_client",
      "transform"
    ],
    "attributes": {
      "ml.machine_memory": "17179869184",
      "xpack.installed": "true",
      "transform.node": "true",
      "ml.max_open_jobs": "512",
      "ml.max_jvm_size": "8589934592"
    },
    "breakers": {
      "request": {
        "limit_size_in_bytes": 5153960755,
        "limit_size": "4.7gb",
        "estimated_size_in_bytes": 0,
        "estimated_size": "0b",
        "overhead": 1,
        "tripped": 0
      },
      "fielddata": {
        "limit_size_in_bytes": 3435973836,
        "limit_size": "3.1gb",
        "estimated_size_in_bytes": 0,
        "estimated_size": "0b",
        "overhead": 1.03,
        "tripped": 0
      },
      "in_flight_requests": {
        "limit_size_in_bytes": 8589934592,
        "limit_size": "8gb",
        "estimated_size_in_bytes": 0,
        "estimated_size": "0b",
        "overhead": 2,
        "tripped": 0
      },
      "model_inference": {
        "limit_size_in_bytes": 4294967296,
        "limit_size": "4gb",
        "estimated_size_in_bytes": 0,
        "estimated_size": "0b",
        "overhead": 1,
        "tripped": 0
      },
      "eql_sequence": {
        "limit_size_in_bytes": 4294967296,
        "limit_size": "4gb",
        "estimated_size_in_bytes": 0,
        "estimated_size": "0b",
        "overhead": 1,
        "tripped": 0
      },
      "accounting": {
        "limit_size_in_bytes": 8589934592,
        "limit_size": "8gb",
        "estimated_size_in_bytes": 54473076,
        "estimated_size": "51.9mb",
        "overhead": 1,
        "tripped": 0
      },
      "parent": {
        "limit_size_in_bytes": 8160437862,
        "limit_size": "7.5gb",
        "estimated_size_in_bytes": 5529302008,
        "estimated_size": "5.1gb",
        "overhead": 1,
        "tripped": 3186
      }
    }
  }
}
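The breaker stats were pulled from the node stats API, limited to the breaker metric, along these lines:

```python
import json

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://elasticsearch-master:9200"])

# Restrict node stats to the circuit breaker section and print the per-node entries.
stats = es.nodes.stats(metric="breaker")
print(json.dumps(stats["nodes"], indent=2))
```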
I'm trying to piece together what is happening and how to resolve it. Is the garbage collector struggling to keep up? Is there a remediation besides adding still more memory?
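In case it is relevant, this is how I have been sampling GC activity while the errors occur, using the jvm section of the same node stats API; I am assuming old-generation collection counts and times are the right signal to watch:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://elasticsearch-master:9200"])

# Heap usage plus old-generation GC counters for each node.
jvm_stats = es.nodes.stats(metric="jvm")
for node_id, node in jvm_stats["nodes"].items():
    heap_pct = node["jvm"]["mem"]["heap_used_percent"]
    old_gc = node["jvm"]["gc"]["collectors"]["old"]
    print(
        node["name"],
        f"heap={heap_pct}%",
        f"old_gc_count={old_gc['collection_count']}",
        f"old_gc_time_ms={old_gc['collection_time_in_millis']}",
    )
```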