Hi!
We upgraded from Elasticsearch 2.4 to 7.11 and are now seeing the issues below across the cluster.
- ES data nodes keep going down one after another, due to Java heap out-of-memory errors.
- This morning we were re-indexing at around 100k docs/s and everything was fine at first, but at some point we could no longer index at all. All our queue messages are being redelivered because of ES timeout exceptions.
We have 50 data nodes.
Any thoughts on what may be causing this would be appreciated!
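In case it helps narrow things down, this is roughly what we plan to run while the nodes cycle, to see how much heap the fielddata cache and circuit breakers are holding (just the standard stats/cat APIs, nothing specific to our setup):

GET _nodes/stats/jvm,breaker
GET _cat/fielddata?v&s=size:desc
GET _cat/nodes?v&h=name,heap.percent,fielddata.memory_size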
Log:
Data node log just before it went down:
[2021-06-30T13:54:22,381][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [xxxx] fatal error in thread [elasticsearch[xxx][search][T#4]], exiting
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.packed.Packed8ThreeBlocks.<init>(Packed8ThreeBlocks.java:41) ~[lucene-core-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:35:28]
at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:965) ~[lucene-core-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:35:28]
at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:941) ~[lucene-core-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:35:28]
at org.apache.lucene.util.packed.GrowableWriter.ensureCapacity(GrowableWriter.java:80) ~[lucene-core-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:35:28]
at org.apache.lucene.util.packed.GrowableWriter.set(GrowableWriter.java:88) ~[lucene-core-8.7.0.jar:8.7.0 2dc63e901c60cda27ef3b744bc554f1481b3b067 - atrisharma - 2020-10-29 19:35:28]
at org.elasticsearch.index.fielddata.ordinals.OrdinalsBuilder$OrdinalsStore.firstLevel(OrdinalsBuilder.java:176) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.index.fielddata.ordinals.OrdinalsBuilder$OrdinalsStore.addOrdinal(OrdinalsBuilder.java:167) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.index.fielddata.ordinals.OrdinalsBuilder.addDoc(OrdinalsBuilder.java:312) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData.loadDirect(PagedBytesIndexFieldData.java:136) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.index.fielddata.plain.PagedBytesIndexFieldData.loadDirect(PagedBytesIndexFieldData.java:47) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.indices.fielddata.cache.IndicesFieldDataCache$IndexFieldCache.lambda$load$0(IndicesFieldDataCache.java:135) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.indices.fielddata.cache.IndicesFieldDataCache$IndexFieldCache$Lambda$6309/0x0000000801c75ec8.load(Unknown Source) ~[?:?]
at org.elasticsearch.common.cache.Cache.computeIfAbsent(Cache.java:423) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.indices.fielddata.cache.IndicesFieldDataCache$IndexFieldCache.load(IndicesFieldDataCache.java:132) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.index.fielddata.plain.AbstractIndexOrdinalsFieldData.load(AbstractIndexOrdinalsFieldData.java:82) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.index.fielddata.plain.AbstractIndexOrdinalsFieldData.load(AbstractIndexOrdinalsFieldData.java:33) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.support.ValuesSource$Bytes$WithOrdinals$FieldData.globalOrdinalsValues(ValuesSource.java:208) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.support.ValuesSource$Bytes$WithOrdinals.globalMaxOrd(ValuesSource.java:180) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.bucket.terms.TermsAggregatorFactory.getMaxOrd(TermsAggregatorFactory.java:281) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.bucket.terms.TermsAggregatorFactory.access$000(TermsAggregatorFactory.java:40) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.bucket.terms.TermsAggregatorFactory$1.build(TermsAggregatorFactory.java:85) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.bucket.terms.TermsAggregatorFactory.doCreateInternal(TermsAggregatorFactory.java:230) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.support.ValuesSourceAggregatorFactory.createInternal(ValuesSourceAggregatorFactory.java:36) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.AggregatorFactory.create(AggregatorFactory.java:63) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.AggregatorFactories.createSubAggregators(AggregatorFactories.java:187) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.AggregatorBase.<init>(AggregatorBase.java:64) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.bucket.BucketsAggregator.<init>(BucketsAggregator.java:47) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.bucket.DeferableBucketAggregator.<init>(DeferableBucketAggregator.java:35) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.bucket.sampler.SamplerAggregator.<init>(SamplerAggregator.java:164) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.bucket.sampler.SamplerAggregatorFactory.createInternal(SamplerAggregatorFactory.java:33) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.AggregatorFactory.create(AggregatorFactory.java:63) ~[elasticsearch-7.11.2.jar:7.11.2]
at org.elasticsearch.search.aggregations.AggregatorFactories.createSubAggregators(AggregatorFactories.java:187) ~[elasticsearch-7.11.2.jar:7.11.2]
Error 2:
The remote server returned an error: (429) Too Many Requests.. Call: Status code 429 from: POST /feed_XXXX/_update/56246%7C1%7C1409625889747439616?if_primary_term=1&if_seq_no=3102960. ServerError: Type: circuit_breaking_exception Reason: "[parent] Data too large, data for [indices:data/write/update[s]] would be [17041414328/15.8gb], which is larger than the limit of [16320875724/15.1gb], real usage: [17041410408/15.8gb], new bytes reserved: [3920/3.8kb], usages [request=0/0b, fielddata=2388112750/2.2gb, in_flight_requests=5126/5kb, model_inference=0/0b, accounting=4148192/3.9mb]"
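The stack trace looks like fielddata being loaded for global ordinals during a terms/sampler aggregation (PagedBytesIndexFieldData usually means aggregating on a text field with fielddata enabled), and the 429 shows the parent breaker at ~15.8gb of the 16gb heap with about 2.2gb of fielddata. As a sketch of what we are considering to cap that (these are the standard fielddata cache/breaker settings; the 10% / 20% values are just placeholders we have not validated):

# elasticsearch.yml (static) - bound the fielddata cache itself
indices.fielddata.cache.size: 10%

# dynamic - cap the fielddata circuit breaker below its 40% default
PUT _cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit": "20%"
  }
}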
Configuration:
ES version: 7.11.2
**elasticsearch.yml**
bootstrap.memory_lock: true
cloud.node.auto_attributes: true
cluster:
  name: XXXXXXX
  routing.allocation.awareness.attributes: aws_availability_zone
discovery:
  seed_providers: ec2
  ec2.groups: XXXXX
network.host: XX.XXX.X.XXX
node:
  name: ${HOSTNAME}
  roles: [ data ]
http.max_content_length: 200mb
siren.memory.root.limit: 2147483647
**jvm.options**
# Xms/Xmx set the initial and maximum size of the total heap space
-Xms16g
-Xmx16g
-Dsiren.io.netty.maxDirectMemory=2147483648
## DNS cache policy
-Des.networkaddress.cache.ttl=60
-Des.networkaddress.cache.negative.ttl=10
# pre-touch memory pages used by the JVM during initialization
-XX:+AlwaysPreTouch
# explicitly set the stack size
-Xss1m
# set to headless, just in case
-Djava.awt.headless=true
# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8
# use our provided JNA always versus the system one
-Djna.nosys=true
# turn off a JDK optimization that throws away stack traces for common
# exceptions because stack traces are important for debugging
-XX:-OmitStackTraceInFastThrow
# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0
# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Djava.io.tmpdir=${ES_TMPDIR}
-XX:-HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/lib/elasticsearch
-XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:/var/log/elasticsearch/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m
#JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m
9-:-Djava.locale.providers=COMPAT
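Since the nodes are dying with java.lang.OutOfMemoryError, we are also thinking about turning heap dumps back on so we can see what is actually filling the heap when a node drops; the only change would be flipping the existing flag (the dump path is already set above):

# re-enable heap dumps on OOM (currently disabled via -XX:-HeapDumpOnOutOfMemoryError)
-XX:+HeapDumpOnOutOfMemoryError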
Thanks!!