CircuitBreakingException

I am using an ELK cluster (3 master + 3 data + 2 coordinating nodes) together with a Kafka cluster. I am also running Winlogbeat and Packetbeat on the client side. In Kibana Discover I noticed that Winlogbeat data was coming in slowly, while Packetbeat data was coming in at a normal speed.

In elasticsearch.log:

Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for 
[indices:data/write/bulk[s]] would be [7294326534/6.7gb], which is larger than the limit of [7140383129/6.6gb],
 real usage: [7294322264/6.7gb], new bytes reserved: [4270/4.1kb], usages [request=0/0b, fielddata=2988976421/2.7gb,
 in_flight_requests=336714/328.8kb, model_inference=0/0b, accounting=140195792/133.7mb]

[2020-12-14T22:08:51,435][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [ed2] GC did bring memory usage down, before [8160747120], after [7818122856], allocations [29], duration [41]
[2020-12-14T22:08:56,158][WARN ][o.e.m.j.JvmGcMonitorService] [ed2] [gc][2333] overhead, spent [1.4s] collecting in the last [2s]
[2020-12-14T22:09:12,488][WARN ][o.e.i.b.fielddata        ] [ed2] [fielddata] New used memory 3439741693 [3.2gb] for data of [_id] would be larger than configured breaker: 3435973836 [3.1gb], breaking

In the Logstash logs:

[2020-12-14T14:34:36,615][INFO ][logstash.outputs.elasticsearch][main][fd9722125a07fed1dad02c5502ff74ecc820099b27c3bc7d15e5e3dd5e398db4] retrying failed action with response code: 429 ({"type"=>"circuit_breaking_exception", "reason"=>"[parent] Data too large, data for [indices:data/write/bulk[s]] would be [8359714998/7.7gb], which is larger than the limit of [8160437862/7.5gb], real usage: [8359534616/7.7gb], new bytes reserved: [180382/176.1kb], usages [request=0/0b, fielddata=2527991541/2.3gb, in_flight_requests=229140/223.7kb, model_inference=0/0b, accounting=157021942/149.7mb]", "bytes_wanted"=>8359714998, "bytes_limit"=>8160437862, "durability"=>"PERMANENT"})
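For reference, the limits in these messages line up with the default 7.x breaker settings (a sketch, assuming defaults: the parent breaker trips at 95% of heap when real memory tracking is on, and the fielddata breaker at 40% of heap), which matches a 7 GB heap on the node logging the first error and an 8 GB heap on the node behind the Logstash 429s:

```python
GiB = 2**30

# Parent circuit breaker: defaults to 95% of heap with real memory
# tracking enabled (indices.breaker.total.limit).
assert int(7 * GiB * 0.95) == 7140383129  # "limit of [7140383129/6.6gb]" -> 7 GB heap
assert int(8 * GiB * 0.95) == 8160437862  # "bytes_limit"=>8160437862   -> 8 GB heap

# Fielddata breaker: defaults to 40% of heap
# (indices.breaker.fielddata.limit).
assert int(8 * GiB * 0.40) == 3435973836  # "configured breaker: 3435973836 [3.1gb]"
```

So the fielddata warning on `_id` means roughly 40% of an 8 GB heap is already held by fielddata before the new load, which is why the parent breaker keeps rejecting bulk writes.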

I am using JDK 11.

What does your Logstash output section to Elasticsearch look like?

output {
  elasticsearch {
    hosts => ["http://ec1:9200","http://ec2:9200"]
    manage_template => false
    index => "log-pb-%{+YYYY.MM.dd}"
    user => "user"
    password => "password"
  }
}

Which version of Elasticsearch are you using? What is the full output of the cluster stats API?

7.9
GET /_cluster/stats

{
  "_nodes" : {
    "total" : 8,
    "successful" : 8,
    "failed" : 0
  },
  "cluster_name" : "HELK",
  "cluster_uuid" : "5X0wtmdxTEGMTs4GF9ZLvA",
  "timestamp" : 1608026169200,
  "status" : "yellow",
  "indices" : {
    "count" : 261,
    "shards" : {
      "total" : 1428,
      "primaries" : 715,
      "replication" : 0.9972027972027973,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 6,
          "avg" : 5.471264367816092
        },
        "primaries" : {
          "min" : 1,
          "max" : 3,
          "avg" : 2.739463601532567
        },
        "replication" : {
          "min" : 0.0,
          "max" : 1.0,
          "avg" : 0.994891443167305
        }
      }
    },
    "docs" : {
      "count" : 433315046,
      "deleted" : 9458997
    },
    "store" : {
      "size_in_bytes" : 475388321515,
      "reserved_in_bytes" : 0
    },
    "fielddata" : {
      "memory_size_in_bytes" : 9311911952,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 31742295,
      "total_count" : 11885028,
      "hit_count" : 2691923,
      "miss_count" : 9193105,
      "cache_size" : 112387,
      "cache_count" : 159299,
      "evictions" : 46912
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 12072,
      "memory_in_bytes" : 436154732,
      "terms_memory_in_bytes" : 342747184,
      "stored_fields_memory_in_bytes" : 6300496,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 47904320,
      "points_memory_in_bytes" : 0,
      "doc_values_memory_in_bytes" : 39202732,
      "index_writer_memory_in_bytes" : 367692584,
      "version_map_memory_in_bytes" : 2399193,
      "fixed_bit_set_memory_in_bytes" : 9635584,
      "max_unsafe_auto_id_timestamp" : 1608010220337,
      "file_sizes" : { }
    },
    "mappings" : {
      "field_types" : [
        {
          "name" : "binary",
          "count" : 14,
          "index_count" : 3
        },
        {
          "name" : "boolean",
          "count" : 414,
          "index_count" : 151
        },
        {
          "name" : "byte",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "date",
          "count" : 1599,
          "index_count" : 260
        },
        {
          "name" : "date_nanos",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "date_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "double",
          "count" : 8,
          "index_count" : 8
        },
        {
          "name" : "double_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "flattened",
          "count" : 9,
          "index_count" : 1
        },
        {
          "name" : "float",
          "count" : 177,
          "index_count" : 34
        },
        {
          "name" : "float_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "geo_point",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "geo_shape",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "half_float",
          "count" : 78,
          "index_count" : 22
        },
        {
          "name" : "integer",
          "count" : 200,
          "index_count" : 18
        },
        {
          "name" : "integer_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "ip",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "ip_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "keyword",
          "count" : 20589,
          "index_count" : 258
        },
        {
          "name" : "long",
          "count" : 4445,
          "index_count" : 251
        },
        {
          "name" : "long_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "nested",
          "count" : 62,
          "index_count" : 17
        },
        {
          "name" : "object",
          "count" : 4680,
          "index_count" : 257
        },
        {
          "name" : "shape",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "short",
          "count" : 22,
          "index_count" : 8
        },
        {
          "name" : "text",
          "count" : 19479,
          "index_count" : 238
        }
      ]
    },
    "analysis" : {
      "char_filter_types" : [ ],
      "tokenizer_types" : [ ],
      "filter_types" : [
        {
          "name" : "pattern_capture",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "analyzer_types" : [
        {
          "name" : "custom",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_char_filters" : [ ],
      "built_in_tokenizers" : [
        {
          "name" : "uax_url_email",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_filters" : [
        {
          "name" : "lowercase",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "unique",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_analyzers" : [ ]
    }
  },
  "nodes" : {
    "count" : {
      "total" : 8,
      "coordinating_only" : 2,
      "data" : 3,
      "ingest" : 0,
      "master" : 3,
      "ml" : 0,
      "remote_cluster_client" : 0,
      "transform" : 0,
      "voting_only" : 0
    },
    "versions" : [
      "7.9.3",
      "7.10.0"
    ],
    "os" : {
      "available_processors" : 32,
      "allocated_processors" : 32,
      "names" : [
        {
          "name" : "Linux",
          "count" : 8
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "CentOS Linux 8 (Core)",
          "count" : 8
        }
      ],
      "mem" : {
        "total_in_bytes" : 132800380928,
        "free_in_bytes" : 43966423040,
        "used_in_bytes" : 88833957888,
        "free_percent" : 33,
        "used_percent" : 67
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 7
      },
      "open_file_descriptors" : {
        "min" : 424,
        "max" : 4637,
        "avg" : 1946
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 2827082426,
      "versions" : [
        {
          "version" : "15",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "15+36-1562",
          "vm_vendor" : "Oracle Corporation",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 5
        },
        {
          "version" : "15.0.1",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "15.0.1+9",
          "vm_vendor" : "AdoptOpenJDK",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 3
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 25820404688,
        "heap_max_in_bytes" : 56908316672
      },
      "threads" : 623
    },
    "fs" : {
      "total_in_bytes" : 1656356163584,
      "free_in_bytes" : 1119258292224,
      "available_in_bytes" : 1119258292224
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "security4" : 8
      },
      "http_types" : {
        "security4" : 8
      }
    },
    "discovery_types" : {
      "zen" : 8
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "rpm",
        "count" : 8
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 0,
      "processor_stats" : { }
    }
  }
}

All nodes should be the same version; please upgrade the older versions ASAP.

Same with the JVM.

Will it be safe to upgrade? I mean, data should not be lost. It was working fine for the last 20 days; now it still works, but not as well as before.

You should never run a mixed-version cluster for that long; you are going to have issues.

I have upgraded java-1.8.0 to java-11 on all Elasticsearch and Logstash nodes, but GET /_cluster/stats still gives me the same JVM result:

 "jvm" : {
      "max_uptime_in_millis" : 4291087,
      "versions" : [
        {
          "version" : "15.0.1",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "15.0.1+9",
          "vm_vendor" : "AdoptOpenJDK",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 3
        },
        {
          "version" : "15",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "15+36-1562",
          "vm_vendor" : "Oracle Corporation",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 5
        }
      ],

I am upgrading Elasticsearch 7.9 to 7.10, but first I updated Java.

Did you restart the instance(s)?

Yes, I restarted all the Elasticsearch nodes.
Below is my jvm.options file from an Elasticsearch data node:


## JVM configuration

################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms8g
-Xmx8g

################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################

## GC configuration
8-13:-XX:+UseConcMarkSweepGC
8-13:-XX:CMSInitiatingOccupancyFraction=75
8-13:-XX:+UseCMSInitiatingOccupancyOnly

## G1GC Configuration
# NOTE: G1 GC is only supported on JDK version 10 or later
# to use G1GC, uncomment the next two lines and update the version on the
# following three lines to your version of the JDK
# 10-13:-XX:-UseConcMarkSweepGC
# 10-13:-XX:-UseCMSInitiatingOccupancyOnly
14-:-XX:+UseG1GC
14-:-XX:G1ReservePercent=25
14-:-XX:InitiatingHeapOccupancyPercent=30
11-:-XX:G1ReservePercent=25
11-:-XX:InitiatingHeapOccupancyPercent=30
## JVM temporary directory
-Djava.io.tmpdir=${ES_TMPDIR}

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=/home/elasticsearch

# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log

## JDK 8 GC logging
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:/var/log/elasticsearch/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m

On the master and coordinating nodes I have set the heap size to 7g, and on the data nodes to 8g.

Do I need to install a JDK for Kibana as well?

Hey, I worked on it, and now it is showing me:

 "versions" : [
      "7.10.1"
    ],
"jvm" : {
      "max_uptime_in_millis" : 356356,
      "versions" : [
        {
          "version" : "15.0.1",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "15.0.1+9",
          "vm_vendor" : "AdoptOpenJDK",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 8
        }
      ],


Hey @warkolm, I am seeing this circuit breaker exception in the Elasticsearch log files. The problem occurs when I use Sigma rules; I am using Elastalert for alerting (https://posts.specterops.io/what-the-helk-sigma-integration-via-elastalert-6edf1715b02). On the data nodes the heap size is 8 GB, and I have 3 data nodes.

I don't know how Elastalert works, so it might be doing something unusual.
You might need to increase the heap size to get around this.

How can I come to know that I should increase the memory of the Elasticsearch data nodes?
I have three data nodes with 16 GB of memory per node, half of which is allocated to Elasticsearch.
When I monitor the data nodes, heap usage is 6.9 GB on data-node-1, and 5.7 GB and 6.5 GB on the other two.
Data node logs:
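One way to watch heap pressure per node is the cat nodes API (column names below are the standard ones; `master` in `node.role` output, heap figures are live values, so check them during ingest, not at idle):

```
GET _cat/nodes?v&h=name,node.role,heap.percent,heap.current,heap.max
```

If heap stays pinned near the 95% parent-breaker threshold during normal indexing, that is the signal that the nodes need more heap or less fielddata, not just a higher breaker limit.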

[2020-12-20T18:33:37,612][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [ed3] GC did bring memory usage down, before [7142571744], after [7061764672], allocations [79], duration [61]
[2020-12-20T18:33:42,614][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [ed3] attempting to trigger G1GC due to high heap usage [7373009448]
[2020-12-20T18:33:42,659][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [ed3] GC did bring memory usage down, before [7373009448], after [7204475952], allocations [23], duration [45]
[2020-12-20T18:33:49,012][WARN ][o.e.m.j.JvmGcMonitorService] [ed3] [gc][5515] overhead, spent [1.4s] collecting in the last [2.2s]
[2020-12-20T18:33:51,130][WARN ][o.e.i.b.fielddata        ] [ed3] [fielddata] New used memory 3009727906 [2.8gb] for data of [_id] would be larger than configured breaker: 3006477107 [2.7gb], breaking

I followed this page, but the same warning comes again.
Do I need to increase the memory?
And yes, when I did this:

GET /_nodes/stats
{
 "parent" : {
          "limit_size_in_bytes" : 6442450944,
          "limit_size" : "6gb",
          "estimated_size_in_bytes" : 4057777776,
          "estimated_size" : "3.7gb",
          "overhead" : 1.0,
          "tripped" : 422545
        }
}

This happened after I put in this setting:

PUT /_cluster/settings
{
  "persistent": {
    "indices.breaker.total.limit": "6GB"
  },
   "transient": {
    "indices.breaker.total.limit": "6GB"
  }
}
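As a quick sanity check (a sketch): `limit_size_in_bytes` in the node stats above is exactly the 6 GB override, so the cluster setting is active. Note that setting the same key both persistently and transiently is redundant, since the transient value takes precedence; to return to the default (95% of heap), the override can be removed by setting it to `null`.

```python
GiB = 2**30

# The 6 GB override, as reported by the node stats snippet.
limit = 6 * GiB
assert limit == 6442450944          # "limit_size_in_bytes" : 6442450944

# Current estimated usage from the same snippet.
estimated = 4057777776              # "estimated_size" : "3.7gb"
headroom = limit - estimated        # bytes left before the parent breaker trips
```

Also keep in mind that 6 GB is *below* the 7.x default of 95% of an 8 GB heap, so this override makes the breaker trip earlier, not later; the `"tripped" : 422545` count shows how often it has fired.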

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.