I'm hoping someone can help me understand what is causing this exception. This is being thrown frequently, both while writing to and reading from the cluster.
Here's an example error I received while running GET /_cat/indices?v in Kibana:
{
  "error": {
    "root_cause": [
      {
        "type": "circuit_breaking_exception",
        "reason": "[parent] Data too large, data for [<http_request>] would be [4075745992/3.7gb], which is larger than the limit of [4063657984/3.7gb], real usage: [4075745992/3.7gb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=8989/8.7kb, in_flight_requests=0/0b, accounting=1803266/1.7mb]",
        "bytes_wanted": 4075745992,
        "bytes_limit": 4063657984,
        "durability": "PERMANENT"
      }
    ],
    "type": "circuit_breaking_exception",
    "reason": "[parent] Data too large, data for [<http_request>] would be [4075745992/3.7gb], which is larger than the limit of [4063657984/3.7gb], real usage: [4075745992/3.7gb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=8989/8.7kb, in_flight_requests=0/0b, accounting=1803266/1.7mb]",
    "bytes_wanted": 4075745992,
    "bytes_limit": 4063657984,
    "durability": "PERMANENT"
  },
  "status": 429
}
My cluster has 3 dedicated master nodes and 3 data nodes. It contains 2 indices (plus the .kibana system indices), and each of my indices has 2 primary shards with the replica count set to 1.
From what I've read, this error means the parent circuit breaker tripped because heap usage reached 95% on at least one node. But when I add up the usage reported by the child breakers (request + fielddata + in_flight_requests + accounting), they never total more than ~20mb, so something else must be responsible for the memory usage.
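For reference, these are the kinds of requests I've been using to add up the breaker and heap numbers per node (my understanding is that on 7.x the parent breaker tracks real heap usage, with indices.breaker.total.limit defaulting to 95% of the heap):

GET /_nodes/stats/breaker,jvm?human

GET /_cat/nodes?v&h=name,node.role,heap.percent,heap.current,heap.max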
I noticed that the cluster was in yellow status, which I narrowed down to an allocation failure while assigning replicas to the units2 index. I removed the replicas, which returned the cluster to green, and I stopped seeing errors for a while. This made me think that replication was using too much memory and causing the issue. To test this theory, I let the cluster run overnight without the replicas. Unfortunately, this morning I found thousands of new CircuitBreakingExceptions.
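For reference, the replicas were removed with an index settings update along these lines:

PUT /units2/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}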
I'm not sure what to look at next and would appreciate any assistance you can provide. For some context, I've run several commands this morning to look for clues. I've copied the output of those commands below.
Here's the output of the GET /_cat/indices?v command:
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open units2 AHcurH6cTASFSj4AF1q7rQ 2 0 14147313 1107187 1.8gb 1.8gb
green open .kibana_2 _Uo50jEPQOK9iBGbh9zw4w 1 1 3.7kb
green open places2 L0T_uvxZR8maIVwO3d44hw 2 1 6356 2570 45.7mb 19.6mb
green open .kibana_1 OousjPfkSHeySiLefOdGOw 1 1 283b
The output of GET /_cat/shards?v:
index shard prirep state docs store ip node
.kibana_2 0 p STARTED 1 3.7kb x.x.x.x 451a15942b572d7159f0736533a7533b
.kibana_2 0 r STARTED 1 3.7kb x.x.x.x 7bcda2e106963bc7c4099a16d057b265
.kibana_1 0 r STARTED 0 283b x.x.x.x 66c9c69b225cb26bb1988e6427d529a8
.kibana_1 0 p STARTED 0 283b x.x.x.x 451a15942b572d7159f0736533a7533b
places2 1 r STARTED 6405 12.8mb x.x.x.x 66c9c69b225cb26bb1988e6427d529a8
places2 1 p STARTED 6405 13.3mb x.x.x.x 451a15942b572d7159f0736533a7533b
places2 0 p STARTED 6356 19.6mb x.x.x.x 66c9c69b225cb26bb1988e6427d529a8
places2 0 r STARTED 6356 13.3mb x.x.x.x 7bcda2e106963bc7c4099a16d057b265
units2 1 p STARTED 14353629 2.1gb x.x.x.x 451a15942b572d7159f0736533a7533b
units2 0 p STARTED 14147313 1.8gb x.x.x.x 7bcda2e106963bc7c4099a16d057b265
The output of GET /_cluster/stats:
{
  "_nodes" : {
    "total" : 6,
    "successful" : 6,
    "failed" : 0
  },
  "cluster_name" : "843863714247:search-00",
  "cluster_uuid" : "z-d6Y0FwRXikLrOSwBlPxg",
  "timestamp" : 1587481456218,
  "status" : "green",
  "indices" : {
    "count" : 4,
    "shards" : {
      "total" : 10,
      "primaries" : 6,
      "replication" : 0.6666666666666666,
      "index" : {
        "shards" : {
          "min" : 2,
          "max" : 4,
          "avg" : 2.5
        },
        "primaries" : {
          "min" : 1,
          "max" : 2,
          "avg" : 1.5
        },
        "replication" : {
          "min" : 0.0,
          "max" : 1.0,
          "avg" : 0.75
        }
      }
    },
    "docs" : {
      "count" : 28513707,
      "deleted" : 3754420
    },
    "store" : {
      "size_in_bytes" : 4137770567
    },
    "fielddata" : {
      "memory_size_in_bytes" : 18752,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 1109584,
      "total_count" : 3410091,
      "hit_count" : 613985,
      "miss_count" : 2796106,
      "cache_size" : 65,
      "cache_count" : 36427,
      "evictions" : 36362
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 63,
      "memory_in_bytes" : 4344431,
      "terms_memory_in_bytes" : 1655163,
      "stored_fields_memory_in_bytes" : 1085760,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 128576,
      "points_memory_in_bytes" : 867196,
      "doc_values_memory_in_bytes" : 607736,
      "index_writer_memory_in_bytes" : 0,
      "version_map_memory_in_bytes" : 0,
      "fixed_bit_set_memory_in_bytes" : 102640,
      "max_unsafe_auto_id_timestamp" : -1,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 6,
      "coordinating_only" : 0,
      "data" : 3,
      "ingest" : 3,
      "master" : 3
    },
    "versions" : [ "7.4.2" ],
    "os" : {
      "available_processors" : 12,
      "allocated_processors" : 12,
      "names" : [ {
        "count" : 6
      } ],
      "pretty_names" : [ {
        "count" : 6
      } ],
      "mem" : {
        "total_in_bytes" : 35828772864,
        "free_in_bytes" : 4621955072,
        "used_in_bytes" : 31206817792,
        "free_percent" : 13,
        "used_percent" : 87
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 106
      },
      "open_file_descriptors" : {
        "min" : 1403,
        "max" : 1506,
        "avg" : 1445
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 2994403385,
      "mem" : {
        "heap_used_in_bytes" : 13192197672,
        "heap_max_in_bytes" : 19222757376
      },
      "threads" : 759
    },
    "fs" : {
      "total_in_bytes" : 656313581568,
      "free_in_bytes" : 644325363712,
      "available_in_bytes" : 644224700416
    },
    "network_types" : {
      "transport_types" : {
        "com.amazon.opendistroforelasticsearch.security.ssl.http.netty.OpenDistroSecuritySSLNettyTransport" : 6
      },
      "http_types" : {
        "filter-jetty" : 6
      }
    },
    "discovery_types" : {
      "zen" : 6
    },
    "packaging_types" : [ {
      "flavor" : "oss",
      "type" : "tar",
      "count" : 6
    } ]
  }
}