I keep seeing my ES cluster go into [GC] mode. My cluster configuration is below.
3 datastores with the following H/W configuration:
1: 8C/64GB/6TB
2: 8C/64GB/6TB
3: 8C/64GB/6TB
Current Cluster Stats:
Total No of Shards: 290 (including replica)
Primary Shards: 145
Heap Size Assigned: 25GB
Cluster Health: Green
Elasticsearch Version: 6.3.0
The cluster otherwise runs properly without issues. Still, it keeps going into garbage collection mode and then goes completely down.
Does anyone have an idea how to keep this cluster running smoothly?
Elasticsearch Logs:
[2020-04-22T07:30:17,924][INFO ][o.e.m.j.JvmGcMonitorService] [ykmWSiI] [gc][94883] overhead, spent [334ms] collecting in the last [1s]
[2020-04-22T07:32:09,993][INFO ][o.e.m.j.JvmGcMonitorService] [ykmWSiI] [gc][94995] overhead, spent [262ms] collecting in the last [1s]
[2020-04-22T08:00:23,866][INFO ][o.e.m.j.JvmGcMonitorService] [ykmWSiI] [gc][96688] overhead, spent [383ms] collecting in the last [1s]
[2020-04-22T08:00:51,894][INFO ][o.e.m.j.JvmGcMonitorService] [ykmWSiI] [gc][96716] overhead, spent [257ms] collecting in the last [1s]
[2020-04-22T08:01:00,896][INFO ][o.e.m.j.JvmGcMonitorService] [ykmWSiI] [gc][96725] overhead, spent [296ms] collecting in the last [1s]
[2020-04-22T08:01:01,896][INFO ][o.e.m.j.JvmGcMonitorService] [ykmWSiI] [gc][96726] overhead, spent [349ms] collecting in the last [1s]
[2020-04-22T08:01:02,896][INFO ][o.e.m.j.JvmGcMonitorService] [ykmWSiI] [gc][96727] overhead, spent [332ms] collecting in the last [1s]
[2020-04-22T08:20:23,717][INFO ][o.e.c.m.MetaDataMappingService] [ykmWSiI] [dsdb-20200422/K5EiSEzESiuuClJobrZe3Q] update_mapping [evt]
[2020-04-22T08:30:16,736][INFO ][o.e.m.j.JvmGcMonitorService] [ykmWSiI] [gc][98480] overhead, spent [318ms] collecting in the last [1s]
[2020-04-22T08:32:16,901][INFO ][o.e.m.j.JvmGcMonitorService] [ykmWSiI] [gc][98600] overhead, spent [380ms] collecting in the last [1s]
When I run /etc/init.d/elasticsearch status, I can see the Elasticsearch status, as I mentioned earlier. No logs are recorded for that.
If I query localhost:9200 with the health parameter, I can see the cluster health.
Currently the cluster health is green, but the cluster is still in GC (garbage collection). I shared the logs for that in my first comment.
OK, it's not really clear to me what is happening here, sorry.
Your logs do not show that Elasticsearch is down at all.
You get a response from the API during GC.
I don't know the details of how init.d checks whether the service is up, but if the API is working and you can see the process in ps, then I think it's OK.
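If it helps to check this from the API side rather than through init.d, here is a minimal sketch that reads the node stats API (`_nodes/stats/jvm`) and summarizes heap usage and old-generation GC activity per node. The function names and the `localhost:9200` base URL are my own assumptions for illustration, and the sketch assumes an unsecured HTTP endpoint and a collector named `old` in the response.

```python
import json
from urllib.request import urlopen


def fetch_jvm_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats for all nodes (base_url is an assumed default)."""
    with urlopen(f"{base_url}/_nodes/stats/jvm", timeout=10) as resp:
        return json.load(resp)


def summarize_jvm(stats):
    """Return one row per node with heap usage and old-gen GC counters."""
    rows = []
    for node_id, node in stats["nodes"].items():
        jvm = node["jvm"]
        old = jvm["gc"]["collectors"]["old"]
        rows.append({
            "node": node.get("name", node_id),
            "heap_used_percent": jvm["mem"]["heap_used_percent"],
            "old_gc_count": old["collection_count"],
            "old_gc_time_ms": old["collection_time_in_millis"],
        })
    return rows


# Usage against a live cluster (not executed here):
#   for row in summarize_jvm(fetch_jvm_stats()):
#       print(row)
```

A heap that stays close to full together with a fast-growing old GC count would be worth investigating; occasional young-generation collections are not.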
Okay, let me clarify the scenario. Please find my inline responses:
Elasticsearch only shows the GC logs.
Yes, I get a response during the GC state.
Elasticsearch is installed as a service, so I am able to check its status using init.d.
Below is the problem I am trying to solve:
ES goes into GC mode frequently, and I need to solve this GC problem. My final aim is to prevent ES from going into GC.
Is there any setting that I need to configure in ES to prevent GC?
The current Elasticsearch logs indicating the GC state are the same entries I posted in my first message.
You cannot stop Elasticsearch from running GC; it's a totally normal thing.
Unless GC is making Elasticsearch unresponsive via the API and is running for multiple seconds, which would be logged as warnings, there's nothing to worry about.
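To put numbers on that, here is a rough sketch that parses the `JvmGcMonitorService` overhead lines from the logs posted above and reports the share of each window spent in GC along with the log level. The regex and function names are my own; it assumes durations are printed in `ms` or `s` only, as in the posted lines.

```python
import re

# Matches "[timestamp][LEVEL][o.e.m.j.JvmGcMonitorService] ... overhead,
# spent [Xms] collecting in the last [Ys]" as seen in the posted logs.
GC_LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]\[o\.e\.m\.j\.JvmGcMonitorService\].*?"
    r"overhead, spent \[(?P<spent>[\d.]+)(?P<spent_unit>ms|s)\] "
    r"collecting in the last \[(?P<window>[\d.]+)(?P<window_unit>ms|s)\]"
)


def to_ms(value, unit):
    """Convert a duration string plus unit ('ms' or 's') to milliseconds."""
    return float(value) * (1000.0 if unit == "s" else 1.0)


def gc_overhead(line):
    """Return (timestamp, level, percent of window spent in GC), or None."""
    m = GC_LINE.search(line)
    if m is None:
        return None  # not a GC overhead line, e.g. an update_mapping entry
    spent = to_ms(m["spent"], m["spent_unit"])
    window = to_ms(m["window"], m["window_unit"])
    return m["ts"], m["level"], 100.0 * spent / window


# Two entries copied from the logs in this thread.
LINES = [
    "[2020-04-22T07:30:17,924][INFO ][o.e.m.j.JvmGcMonitorService] [ykmWSiI] "
    "[gc][94883] overhead, spent [334ms] collecting in the last [1s]",
    "[2020-04-22T08:20:23,717][INFO ][o.e.c.m.MetaDataMappingService] [ykmWSiI] "
    "[dsdb-20200422/K5EiSEzESiuuClJobrZe3Q] update_mapping [evt]",
]

for line in LINES:
    result = gc_overhead(line)
    if result is not None:
        ts, level, pct = result
        print(f"{ts} {level} {pct:.1f}% of the window spent in GC")
```

All the GC entries posted here are at INFO level and show a few hundred milliseconds per one-second window, which fits the "nothing to worry about" case; entries at WARN level, or pauses of multiple seconds, would be the ones to look into.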