Currently I am running ES 1.7.1 with an 8 GB heap on a 16-core machine, doing filter/aggregation operations on demand. The node was fine for a few days, but then I suddenly started seeing GC logs, and all operations from the Java client started failing:
org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes were available
.
.
.
Caused by: org.elasticsearch.transport.NodeDisconnectedException: [Node][inet[/127.0.0.1:9300]][indices:data/write/index] disconnected
and
[transport] (elasticsearch[Blob][generic][T#22]) [Blob] failed to get local cluster state for [Node][][localhost][inet[/127.0.0.1:9300]], disconnecting...: org.elasticsearch.transport.ReceiveTimeoutTransportException: [Node][inet[/127.0.0.1:9300]][cluster:monitor/state] request_id [308108] timed out after [15001ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:529) [elasticsearch-1.7.1.jar:]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_25]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_25]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_25]
Why does the server become unresponsive when it goes into GC? Isn't GC supposed to run in the background without disconnecting the client?
Your node is running out of memory, and the JVM is doing full GCs to try to free up space. Full GCs are stop-the-world pauses: while one is running, the node can't answer pings or requests, so clients time out and disconnect. Restart the node. The reason it is running out of memory could be anything from simply too much data for the available heap to a memory leak in ES, among other possibilities. There are a number of threads in this group with solutions to issues like yours.
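To confirm that heap pressure is the cause, you can poll JVM stats from your existing Java client. Here is a minimal sketch against the 1.7 transport client API; the `client` argument is assumed to be your already-connected client:

```java
import org.elasticsearch.action.admin.cluster.node.stats.NodeStats;
import org.elasticsearch.action.admin.cluster.node.stats.NodesStatsResponse;
import org.elasticsearch.client.Client;

// Logs heap usage for every node. A node sitting near 100% heap will be
// stuck in back-to-back full GCs and will look disconnected to clients.
public final class HeapCheck {
    public static void logHeapUsage(Client client) {
        NodesStatsResponse stats = client.admin().cluster()
                .prepareNodesStats()
                .setJvm(true)              // request only JVM stats
                .execute().actionGet();
        for (NodeStats node : stats.getNodes()) {
            System.out.printf("%s heap used: %d%%%n",
                    node.getNode().getName(),
                    node.getJvm().getMem().getHeapUsedPercent());
        }
    }
}
```

The same numbers are available over REST via `GET /_nodes/stats/jvm` if you'd rather check with curl.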
The general solution is to run a three-node cluster so that the loss of any single node won't take down the whole cluster.
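On the client side, registering all the nodes (and enabling sniffing) lets the transport client fail over when one node stalls in a long GC instead of throwing NoNodeAvailableException. A sketch for the 1.7 Java API; the cluster name and host names are placeholders for your own:

```java
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

// Builds a client that knows about every node in the cluster, so a single
// unresponsive node does not make all requests fail.
public final class FailoverClient {
    public static TransportClient build() {
        Settings settings = ImmutableSettings.settingsBuilder()
                .put("cluster.name", "my-cluster")   // placeholder name
                .put("client.transport.sniff", true) // discover other nodes
                .build();
        return new TransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress("node1", 9300))
                .addTransportAddress(new InetSocketTransportAddress("node2", 9300))
                .addTransportAddress(new InetSocketTransportAddress("node3", 9300));
    }
}
```

On the server side, a three-node cluster should also set discovery.zen.minimum_master_nodes to 2 in elasticsearch.yml to avoid split-brain when a node drops out.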