Elasticsearch nodes not responding anymore - please help!

Hi, my problem is that after a few days of running, the ES nodes stop responding until I restart them. For this reason I installed Marvel to take a look at the system.

I am attaching the Marvel stats. I hope you can help me!

In the image you can see that when the JVM heap goes above 75%, the load climbs and the nodes die. They cannot respond anymore.

In my settings each node had a heap of 12g (ES_HEAP_SIZE=12g). I have now reduced it to 8g to see what happens.

Does anyone have an idea where the problem could be? Do you need any other info about my settings?

Thanks
Nik

Is there anything interesting in the logs?

I have many log entries like this:

[2016-09-06 23:21:32,056][WARN ][transport] [xxx-master-1] Received response for a request that has timed out, sent [19090ms] ago, timed out [4090ms] ago, actio

Do these graphs of the JVM heap on the 2 nodes over the last 4 hours look good?

Thanks
Nik

That looks like a fairly normal garbage collection cycle, but you may want to ask what aspect of your workload is driving up the heap utilization so quickly.
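
One way to see which node's heap climbs, and when, is to poll the nodes stats API (GET /_nodes/stats/jvm) and look at jvm.mem.heap_used_percent for each node. The following is only a rough sketch of the parsing side: the function name, the 75% default, and the sample data are made up for illustration, and the response would have to be fetched from the cluster separately.

```python
# Sketch: flag nodes whose JVM heap usage is at or above a threshold.
# The input is the parsed JSON body of GET /_nodes/stats/jvm.

def nodes_over_heap_threshold(stats, threshold_percent=75):
    """Return (node_name, heap_used_percent) pairs at or above the threshold."""
    flagged = []
    for node in stats.get("nodes", {}).values():
        used = node["jvm"]["mem"]["heap_used_percent"]
        if used >= threshold_percent:
            flagged.append((node.get("name", "unknown"), used))
    # Highest usage first, so the worst node is listed at the top.
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hand-written sample in the shape of a nodes stats response (not real data).
sample = {
    "nodes": {
        "id-a": {"name": "node-a", "jvm": {"mem": {"heap_used_percent": 81}}},
        "id-b": {"name": "node-b", "jvm": {"mem": {"heap_used_percent": 42}}},
    }
}
print(nodes_over_heap_threshold(sample))
```

Running this every minute or so during the daily rebuild and logging the result would show whether the heap climbs only while the new index is being filled.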

Hi kstaken, at the point where you see the heap increase, I am rebuilding the index. I create a new index, fill it with data, and delete the old one. This happens every day.

I am wondering why the heap is not freed when the old index is deleted. Should I somehow clear the heap at the moment I delete the old index?

Here is my data from HQ. Does anyone need more info?

Can someone please help me? My cluster stops responding every 3 days, and this is causing big problems on my sites.

Thanks
Nik

How much physical memory is on the node relative to the heap size? How many nodes? Master and data nodes separate or combined?
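
The reason for asking about physical memory: the usual guidance is to give the heap no more than about half of the machine's RAM (the rest is needed for the filesystem cache) and to stay below roughly 32 GB so the JVM can keep using compressed object pointers. A small sketch of that check follows; the function name is hypothetical, and the 16 GB of RAM in the example is a made-up figure, since the actual hardware has not been posted.

```python
# Sketch: compare a configured heap size against common sizing guidance.
# Both limits are rules of thumb, not hard limits enforced by Elasticsearch.

def check_heap_sizing(heap_gb, physical_ram_gb):
    """Return a list of warnings for a given heap size and physical RAM."""
    warnings = []
    if heap_gb > physical_ram_gb / 2:
        warnings.append("heap is more than half of physical RAM")
    if heap_gb >= 32:
        warnings.append("heap is at or above roughly 32 GB")
    return warnings


# Hypothetical machine with 16 GB RAM; 12 and 8 are the heap sizes from this thread.
print(check_heap_sizing(12, 16))
print(check_heap_sizing(8, 16))
```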

Maybe your reindexing process is simply running too aggressively and needs to be throttled back some.
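
As a rough illustration of what throttling could look like on the client side, the sketch below sends documents in fixed-size batches and pauses after each full batch. Everything here is an assumption: the function name, the batch size, and the pause are placeholders, and send stands in for whatever call your indexing job uses to submit a bulk request.

```python
import time

# Sketch: feed documents to the cluster in small batches with a pause
# after each full batch, so the nodes get time to catch up.

def throttled_batches(docs, batch_size, pause_seconds, send, sleep=time.sleep):
    """Call send(batch) for each batch of docs; return the number of docs sent."""
    sent = 0
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            send(batch)
            sent += len(batch)
            batch = []            # start a fresh list for the next batch
            sleep(pause_seconds)  # back off before sending more
    if batch:
        # Send whatever is left over as a final, smaller batch.
        send(batch)
        sent += len(batch)
    return sent


# Stand-in for a real bulk request: just collect the batches in a list.
received = []
total = throttled_batches(range(5), batch_size=2, pause_seconds=0, send=received.append)
print(total, [len(b) for b in received])
```

Starting with a small batch size and a noticeable pause, then loosening both while watching the heap graphs, is one way to find a rate the cluster can sustain.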

Hi, here are my server and ES settings:

Cheers
Nik