Indexing taking a lot of time due to GC overhead


#1

Issue: Indexing time has increased drastically over the past few weeks, from 15 mins to 8 hrs for the same amount of data. We didn't make any particular changes to any of the nodes in the cluster. GC is taking a lot of time: [o.e.m.j.JvmGcMonitorService][gc][83699] overhead, spent [809ms] collecting in the last [1.2s]
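To put a number on the log line quoted above, here is a minimal sketch that parses a JvmGcMonitorService overhead line and computes the share of the window spent in GC. The function name `gc_overhead_fraction` is made up for illustration, and the pattern only handles the `ms`, `s` and `m` units; other units are assumed not to appear.

```python
import re

# Matches the "spent [X] collecting in the last [Y]" part of the overhead line.
# Only ms / s / m units are handled (an assumption of this sketch).
_GC_OVERHEAD = re.compile(
    r"spent \[(?P<spent>[\d.]+)(?P<spent_unit>ms|s|m)\] collecting in the last "
    r"\[(?P<window>[\d.]+)(?P<window_unit>ms|s|m)\]"
)

_UNIT_TO_MS = {"ms": 1.0, "s": 1000.0, "m": 60000.0}


def gc_overhead_fraction(line):
    """Return GC time divided by window time, or None if the line does not match."""
    m = _GC_OVERHEAD.search(line)
    if m is None:
        return None
    spent_ms = float(m["spent"]) * _UNIT_TO_MS[m["spent_unit"]]
    window_ms = float(m["window"]) * _UNIT_TO_MS[m["window_unit"]]
    return spent_ms / window_ms


line = ("[o.e.m.j.JvmGcMonitorService][gc][83699] overhead, "
        "spent [809ms] collecting in the last [1.2s]")
fraction = gc_overhead_fraction(line)
print(f"{fraction:.1%} of the window was spent in GC")
```

For the line in this post that is 809 ms out of 1200 ms, so roughly two thirds of the sampled window went to garbage collection.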

Cluster details: 1 master & 2 clients (16 GB memory), 6 data nodes (500 GB storage, 32 GB memory). Memory-related settings on the nodes include bootstrap.memory_lock: true, plus MAX_OPEN_FILES=65536 and MAX_LOCKED_MEMORY=unlimited in /etc/sysconfig/elasticsearch. Half of the memory is allocated to Elasticsearch on all nodes.

Has anyone seen this GC issue, or does anyone have suggestions on how we can resolve it?
Thanks in advance!


(Jason Tedor) #2

What version of 5.x are you running?


#3

I am using 5.2.x


(Jason Tedor) #4

Can you please share the JVM options of the running node? The easiest way to obtain them is to run jps -l -m -v on the server; it prints the JVM options for all running Java processes that your user has permission to see.


#5

84265 org.elasticsearch.bootstrap.Elasticsearch -p /var/run/elasticsearch/elasticsearch.pid -Edefault.path.logs=/var/log/elasticsearch -Edefault.path.data=/var/lib/elasticsearch -Edefault.path.conf=/etc/elasticsearch -Xms16g -Xmx16g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -Djdk.io.permissionsUseCanonicalPath=true -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Dlog4j.skipJansi=true -XX:+HeapDumpOnOutOfMemoryError -Des.path.home=/usr/share/elasticsearch
17295 sun.tools.jps.Jps -l -m -v -Dapplication.home=/opt/java/jdk1.8.0_112 -Xms8m
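The heap settings are easy to miss in a line that long. As a rough sketch, the snippet below pulls the -Xms and -Xmx values out of a jps -v line and converts them to bytes. The helper name `heap_flags` is hypothetical, and the sample string is a shortened copy of the output above.

```python
import re


def heap_flags(jps_line):
    """Return the -Xms / -Xmx values found in a jps -v line, in bytes."""
    units = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    found = {}
    # Each match is a (flag name, number, optional unit suffix) triple.
    for name, number, unit in re.findall(r"-(Xms|Xmx)(\d+)([kKmMgG]?)\b", jps_line):
        found[name] = int(number) * units[unit.lower()]
    return found


# Shortened copy of the Elasticsearch line from the jps output above.
sample = ("84265 org.elasticsearch.bootstrap.Elasticsearch "
          "-Xms16g -Xmx16g -XX:+UseConcMarkSweepGC -Xss1m")
flags = heap_flags(sample)
print({name: value // 1024 ** 3 for name, value in flags.items()})
```

On this output both flags come back as 16 GiB, which matches the "half of the memory" setting described for the 32 GB data nodes.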


#6

@jasontedor I shared the jps -l -m -v output as you asked.


(Jason Tedor) #7

What is your indexing pattern? Are you using bulk requests? What is the size of the bulk payload? Do you use auto-generated IDs? Do you have any monitoring that shows heap usage over time?
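For anyone answering these questions for their own cluster, here is a small sketch of how a bulk payload is laid out and how its size can be measured before sending. It only builds the newline-delimited body and makes no request. The index name `logs`, the type `doc` and the helper `build_bulk_body` are placeholders, not details from this thread. Leaving `_id` out of the action line is what makes Elasticsearch auto-generate the ID.

```python
import json


def build_bulk_body(docs, index, doc_type):
    """Build a _bulk request body (NDJSON) that relies on auto-generated IDs."""
    if not docs:
        return b""
    lines = []
    for doc in docs:
        # Action line without "_id": Elasticsearch assigns the ID itself.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document to index.
        lines.append(json.dumps(doc))
    # The bulk body must end with a trailing newline.
    return ("\n".join(lines) + "\n").encode("utf-8")


docs = [{"message": "first"}, {"message": "second"}]
body = build_bulk_body(docs, index="logs", doc_type="doc")
print(f"{len(docs)} docs, {len(body)} bytes in the bulk payload")
```

Logging the document count and byte size per request is one simple way to answer the "size of the bulk payload" question above.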

