Hi Guys,
I have been facing the issue below for a few days now:
"Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<transport_request>] would be [24876638648/23.1gb], which is larger than the limit of [24481313587/22.7gb], real usage: [24864230864/23.1gb], new bytes reserved: [12407784/11.8mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=12407784/11.8mb, accounting=25462489/24.2mb]"
I had this issue earlier as well, so I looked into the ES discussion forum and found that I needed to scale up the memory. I was previously running with a 12 GB heap; I have now doubled it to 24 GB, but I am still getting this error.
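In case it helps with diagnosis, this is roughly how I have been checking the parent breaker usage and limits on the nodes (a minimal check, assuming the cluster answers on localhost:9200; the host/port will differ per environment):

curl -s "http://localhost:9200/_nodes/stats/breaker?pretty"
curl -s "http://localhost:9200/_cluster/settings?include_defaults=true&flat_settings=true&pretty" | grep breaker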
Here is the configuration for Elasticsearch:
version: '3.4'
services:
  elasticsearch:
    image: ${REGISTRY}/elastic/elasticsearch:7.3.2-431
    environment:
      - cluster.name=mgmt-elasticsearch-cluster
      - bootstrap.memory_lock=true
      - discovery.zen.minimum_master_nodes=3
      - cluster.initial_master_nodes=msql07,msql08,msql09,msql10,msql11,msql12
      - SERVICE_NAME=elasticsearch
      - TAKE_FILE_OWNERSHIP=true
      - ES_JAVA_OPTS=-Xms24g -Xmx24g -XX:-UseConcMarkSweepGC -XX:-UseCMSInitiatingOccupancyOnly -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=75
      - HOSTNAME_COMMAND=curl -H Metadata:true -s http://169.254.169.254/metadata/instance?api-version=2019-06-04 | jq -r '.compute.name'
    labels:
      com.bnsf.mp.description: "ElasticSearch database"
      com.bnsf.mp.department: "XF"
    logging:
      driver: "json-file"
    networks:
      - logging
    volumes:
      - type: bind
        source: /opt/data/elasticsearch
        target: /usr/share/elasticsearch/data
      - type: tmpfs
        target: /usr/share/elasticsearch/logs
    deploy:
      labels:
        traefik.enable: "true"
        traefik.port: "9200"
        traefik.frontend.rule: "Host:kibana.xyz.com;PathPrefixStrip:/elasticsearch/"
        traefik.frontend.entryPoints: "https"
        traefik.docker.network: "logging"
      mode: global
      endpoint_mode: dnsrr
      placement:
        constraints: [node.labels.type == sql]
      resources:
        limits:
          cpus: '4.0'
          memory: 48G
        reservations:
          cpus: '2.0'
          memory: 24G
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
      update_config:
        parallelism: 1
        delay: 60s
        failure_action: rollback
        monitor: 180s
        max_failure_ratio: 0.25
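For what it's worth, since bootstrap.memory_lock=true is set above, a quick way to confirm the lock actually took effect inside the containers (again assuming localhost:9200; mlockall should report true on every node) is:

curl -s "http://localhost:9200/_nodes?filter_path=**.mlockall&pretty"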
Due to this circuit breaker issue, Kibana also dies after a few hours, and ES keeps unassigning shards, so the cluster status turns yellow.
Attaching a few monitoring screenshots from Grafana.
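For reference, this is how I have been looking at why shards end up unassigned (same localhost:9200 assumption; the second call explains the first unassigned shard it finds):

curl -s "http://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason" | grep UNASSIGNED
curl -s "http://localhost:9200/_cluster/allocation/explain?pretty"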
My jvm.options file is below, but I set the heap and a few other settings through the environment instead (ES_JAVA_OPTS=-Xms24g -Xmx24g -XX:-UseConcMarkSweepGC -XX:-UseCMSInitiatingOccupancyOnly -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=75):
## JVM configuration

################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms1g
-Xmx1g

################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################

## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly

## G1GC Configuration
# NOTE: G1GC is only supported on JDK version 10 or later.
# To use G1GC uncomment the lines below.
# 10-:-XX:-UseConcMarkSweepGC
# 10-:-XX:-UseCMSInitiatingOccupancyOnly
# 10-:-XX:+UseG1GC
# 10-:-XX:InitiatingHeapOccupancyPercent=75

## DNS cache policy
# cache ttl in seconds for positive DNS lookups noting that this overrides the
# JDK security property networkaddress.cache.ttl; set to -1 to cache forever
-Des.networkaddress.cache.ttl=60
# cache ttl in seconds for negative DNS lookups noting that this overrides the
# JDK security property networkaddress.cache.negative ttl; set to -1 to cache
# forever
-Des.networkaddress.cache.negative.ttl=10

## optimizations

# pre-touch memory pages used by the JVM during initialization
-XX:+AlwaysPreTouch

## basic

# explicitly set the stack size
-Xss1m

# set to headless, just in case
-Djava.awt.headless=true

# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8

# use our provided JNA always versus the system one
-Djna.nosys=true

# turn off a JDK optimization that throws away stack traces for common
# exceptions because stack traces are important for debugging
-XX:-OmitStackTraceInFastThrow

# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0

# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true

-Djava.io.tmpdir=${ES_TMPDIR}

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=data

# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=logs/hs_err_pid%p.log

## JDK 8 GC logging

8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m

# due to internationalization enhancements in JDK 9 Elasticsearch need to set the provider to COMPAT otherwise
# time/date parsing will break in an incompatible way for some date patterns and locales
9-:-Djava.locale.providers=COMPAT
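To rule out the ES_JAVA_OPTS override not being picked up (the file above still has the 1 GB defaults), this is a rough check that the nodes really start with a 24 GB heap and G1GC (again assuming localhost:9200):

curl -s "http://localhost:9200/_nodes/jvm?pretty" | grep -E 'heap_max|UseG1GC|Xmx'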
Can someone help me resolve this issue?