ElasticSearch 2.3.4 grinding to a halt

Hi All,

I've got an issue with ElasticSearch, or more specifically, I suspect, with the JVM heap it's using.

Before I go into detail overload, let me set out what I have.

The Setup

4 x nodes (3 x master-eligible data nodes, 1 x client node).

Focusing on the master/data nodes: they are running CentOS Linux release 7.2.1511 (Core), with java-1.8.0-openjdk-headless-1.8.0.91-1.b14.el7_2.x86_64 and ElasticSearch 2.3.4.

They each have a single core, 4 GB of memory and 500 GB of storage (a dedicated partition for the ES data).

They are configured (and verified) to mlockall, the OS swappiness is set to 1, and the heap min and max are both set to 2 GB.
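For reference, on a 2.x RPM install those settings live roughly in the following places (file paths assumed from the standard packaging, so treat this as a sketch rather than gospel):

```yaml
# /etc/elasticsearch/elasticsearch.yml
bootstrap.mlockall: true

# /etc/sysconfig/elasticsearch -- ES_HEAP_SIZE sets both -Xms and -Xmx
# ES_HEAP_SIZE=2g

# sysctl (e.g. a file under /etc/sysctl.d/)
# vm.swappiness=1
```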

On initial boot they seem to be working fine and for a little while they do.
Cluster shows healthy, and does so throughout.

To avoid filling the root partition, logging is set to INFO rather than DEBUG.

The Issue

The first symptom I experience, or at least the first I notice, is Kibana reporting plugin:elasticsearch "Request Timeout after 30000ms".

I then notice slow responses from any API that needs to poll the nodes, and I suspect it may be stopping ElasticSearch from operating correctly as a cluster.

The issue is occurring right now and if I try to run a direct /_cat API against the client node it takes quite some time to load.

When it has, it provides the following output - all good so far as I can tell:

host     ip       heap.percent ram.percent load node.role master name
10.0.0.7 10.0.0.7            7          69 0.07 -         -      NODE7
10.0.0.2 10.0.0.2           91          96 1.22 d         *      NODE2
10.0.0.1 10.0.0.1           91          91 0.61 d         m      NODE1
10.0.0.3 10.0.0.3           99          94 1.57 d         m      NODE3

Tailed logs from NODE1:

[2016-08-06 03:46:09,951][WARN ][monitor.jvm              ] [NODE1] [gc][young][48296][58732] duration [1.9s], collections [1]/[2.4s], total [1.9s]/[33.1m], memory [1.3gb]->[1.3gb]/[1.9gb], all_pools {[young] [18mb]->[4.8mb]/[66.5mb]}{[survivor] [4.7mb]->[7.9mb]/[8.3mb]}{[old] [1.3gb]->[1.3gb]/[1.9gb]}

Tailed logs from NODE2:

[2016-08-11 09:17:35,415][WARN ][transport                ] [NODE2] Received response for a request that has timed out, sent [81266ms] ago, timed out [66266ms] ago, action [cluster:monitor/nodes/stats[n]], node [{NODE3}{f29G8IbhSYevQcDPrDUqyg}{10.0.0.3}{10.0.0.3:9300}{max_local_storage_nodes=1}], id [4506258]
[2016-08-11 09:17:35,415][WARN ][transport                ] [NODE2] Received response for a request that has timed out, sent [141270ms] ago, timed out [126267ms] ago, action [cluster:monitor/nodes/stats[n]], node [{NODE3}{f29G8IbhSYevQcDPrDUqyg}{10.0.0.3}{10.0.0.3:9300}{max_local_storage_nodes=1}], id [4506107]

Tailed logs from NODE3:

[2016-08-11 09:12:34,800][INFO ][monitor.jvm              ] [NODE3] [gc][old][370335][68699] duration [6.4s], collections [1]/[6.6s], total [6.4s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [59.3mb]->[56.2mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:12:46,014][INFO ][monitor.jvm              ] [NODE3] [gc][old][370337][68701] duration [6.6s], collections [1]/[6.7s], total [6.6s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [59mb]->[62.2mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}

The first thing I notice from the above log snippets is that NODE2 (the current master) is complaining about NODE3.
Looking at NODE3, something doesn't look right with how it's garbage collecting. My knowledge of the JVM is very minimal and I'm trying to learn enough to support it correctly for ES.

So... Questions at this stage:

  • What triggers garbage collection?
  • Why only NODE3? All indices are shard balanced over the three nodes. Balancing is enabled.
  • Why did NODE1 have a warning on GC, NODE3 only INFO?
  • What can I do to prevent this? Is this a sign that I need to scale up/out?
    • What should I be monitoring to predict that before we get to this stage?
  • Am I even looking at the right thing?

If I was to restart NODE3, I suspect it would restore service.
But only temporarily, and the last time this occurred it was NODE1 as master and NODE2 with the issue.

I've already looked at these threads for answers, which at least helped me frame my questions:

Any advice and answers to my questions is massively appreciated.

Best regards,

jdmac

Forgot to add, I'm also reading through the following in the definitive guide.

https://www.elastic.co/guide/en/elasticsearch/guide/current/_monitoring_individual_nodes.html

I may have actually answered my questions as to the current issue but any further input as a learning experience is very welcome. :slight_smile:

I think the reason for NODE3 being unavailable right now is the pausing by the JVM GC, as when I look at the logs it's running non-stop.

[2016-08-11 09:10:12,922][INFO ][monitor.jvm              ] [NODE3] [gc][old][370309][68673] duration [6.3s], collections [1]/[6.5s], total [6.3s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [51.8mb]->[51.6mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:10:23,538][INFO ][monitor.jvm              ] [NODE3] [gc][old][370311][68675] duration [6.3s], collections [1]/[6.4s], total [6.3s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [57.9mb]->[54.1mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:10:34,539][INFO ][monitor.jvm              ] [NODE3] [gc][old][370313][68677] duration [6.3s], collections [1]/[6.6s], total [6.3s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [52.3mb]->[52.6mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:10:45,390][INFO ][monitor.jvm              ] [NODE3] [gc][old][370315][68679] duration [6.4s], collections [1]/[6.5s], total [6.4s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [51.3mb]->[52.8mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:10:56,637][INFO ][monitor.jvm              ] [NODE3] [gc][old][370317][68681] duration [6.4s], collections [1]/[7s], total [6.4s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [49.5mb]->[55.4mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:11:07,394][INFO ][monitor.jvm              ] [NODE3] [gc][old][370319][68683] duration [6.4s], collections [1]/[6.6s], total [6.4s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [53.7mb]->[61.8mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:11:18,259][INFO ][monitor.jvm              ] [NODE3] [gc][old][370321][68685] duration [6.3s], collections [1]/[6.4s], total [6.3s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [60.2mb]->[52.2mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:11:28,982][INFO ][monitor.jvm              ] [NODE3] [gc][old][370323][68687] duration [6.5s], collections [1]/[6.6s], total [6.5s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [52.4mb]->[53.9mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:11:39,874][INFO ][monitor.jvm              ] [NODE3] [gc][old][370325][68689] duration [6.4s], collections [1]/[6.6s], total [6.4s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [50.6mb]->[56.3mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:11:51,111][INFO ][monitor.jvm              ] [NODE3] [gc][old][370327][68691] duration [6.6s], collections [1]/[6.6s], total [6.6s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [66.5mb]->[52.9mb]/[66.5mb]}{[survivor] [5.3mb]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:12:02,033][INFO ][monitor.jvm              ] [NODE3] [gc][old][370329][68693] duration [6.4s], collections [1]/[6.5s], total [6.4s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [53.4mb]->[56.7mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:12:13,029][INFO ][monitor.jvm              ] [NODE3] [gc][old][370331][68695] duration [6.4s], collections [1]/[6.5s], total [6.4s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [62.4mb]->[59.3mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:12:23,845][INFO ][monitor.jvm              ] [NODE3] [gc][old][370333][68697] duration [6.5s], collections [1]/[6.5s], total [6.5s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [66.5mb]->[56.3mb]/[66.5mb]}{[survivor] [6.8mb]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}
[2016-08-11 09:12:34,800][INFO ][monitor.jvm              ] [NODE3] [gc][old][370335][68699] duration [6.4s], collections [1]/[6.6s], total [6.4s]/[1.7d], memory [1.9gb]->[1.9gb]/[1.9gb], all_pools {[young] [59.3mb]->[56.2mb]/[66.5mb]}{[survivor] [0b]->[0b]/[8.3mb]}{[old] [1.9gb]->[1.9gb]/[1.9gb]}

If I'm reading this right...
The old generation is completely full and each collection is reclaiming almost nothing, so the JVM keeps running full GCs back to back in the hope of freeing something next time, pausing ElasticSearch on this node while it does.
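That reading can be checked mechanically. A rough sketch (Python; the helper names are mine, not an ES tool) that scans monitor.jvm lines like the ones above and flags old-generation collections that reclaimed almost nothing:

```python
import re

# Matches the old-pool section of a monitor.jvm log line, e.g.
# "{[old] [1.9gb]->[1.9gb]/[1.9gb]}" -> before, after, limit.
OLD_POOL = re.compile(
    r"\{\[old\] \[([\d.]+)(gb|mb)\]->\[([\d.]+)(gb|mb)\]/\[([\d.]+)(gb|mb)\]\}"
)

def to_mb(value, unit):
    return float(value) * (1024.0 if unit == "gb" else 1.0)

def unproductive_old_gcs(log_lines, min_freed_fraction=0.01):
    """Return the [gc][old] lines whose collection freed under 1% of the pool."""
    stuck = []
    for line in log_lines:
        if "[gc][old]" not in line:
            continue  # only full (old-generation) collections are of interest
        m = OLD_POOL.search(line)
        if not m:
            continue
        before = to_mb(m.group(1), m.group(2))
        after = to_mb(m.group(3), m.group(4))
        limit = to_mb(m.group(5), m.group(6))
        if before - after < min_freed_fraction * limit:
            stuck.append(line)
    return stuck
```

A long run of flagged lines, like NODE3 shows, means the node is spending most of its time paused in GC without making progress.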

I'm guessing I need to provide more memory or an additional node to the cluster.
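For sizing, the Definitive Guide's rule of thumb is to give the heap about half the machine's RAM (leaving the rest for Lucene's filesystem cache) and stay below the ~32 GB compressed-oops boundary. A trivial sketch of that rule (function name is mine):

```python
def recommended_heap_gb(ram_gb):
    """Rule-of-thumb heap size: half of RAM, capped conservatively at 31 GB
    to stay under the ~32 GB compressed-oops boundary."""
    return min(ram_gb // 2, 31)
```

So a 4 GB box maps to the 2 GB heap used here; moving to 8 GB of RAM would allow a 4 GB heap.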

Hi,

The monitor.jvm log lines are emitted at levels from DEBUG up to WARN depending on how long the garbage collection took. That's why you see different thresholds between NODE1 and NODE3.

It looks like your nodes need more RAM. Are you doing heavy aggregations? What's the output of GET /_nodes/stats?filter_path=nodes.*.indices?

Indexing and aggregations can use a fair bit of CPU, so just having a single core per node is likely insufficient.

Hi Tanguy,

A few minutes ago I started to add more RAM to the cluster and increase the heap, starting with NODE3.

I think I started to answer my question, fortunately this cluster is actually all about getting familiar with ES ready for a more heavy production use, so a little breathing space thankfully.

I'm not aware of us doing heavy aggregations manually - Grafana/Kibana may be utilizing that though.

I think I should've spotted on the /_cat/nodes that the heap percent was too high.
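That check is easy to script. A minimal sketch (the helper name is mine) that parses /_cat/nodes output in the column order shown earlier and flags nodes at or above the 75% heap mark the docs warn about:

```python
def hot_nodes(cat_nodes_output, threshold=75):
    """Return (name, heap.percent) pairs for nodes at or above the threshold.
    Expects columns: host ip heap.percent ram.percent load node.role master name."""
    hot = []
    for line in cat_nodes_output.strip().splitlines():
        fields = line.split()
        if not fields or fields[0] == "host":  # skip the header row
            continue
        heap_pct = int(fields[2])  # heap.percent column
        if heap_pct >= threshold:
            hot.append((fields[-1], heap_pct))
    return hot
```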

I'm guessing that, since I've already started adding RAM, the output you asked for may now show different readings from when the issue occurred?

Thanks,
jdmac

Hi Christian_Dahlqvist,

True, very fair point - it's something I'm bearing in mind.
This is just a proof of concept.

The production environment will be far more 'beefy'.

Many thanks,

jdmac

Kibana uses aggregations, so it would be useful to monitor CPU usage while you are querying to see if this is actually a limiting factor or not.

Cool, good to know. Thanks for the advice.

The issue has now been resolved... It was heap exhaustion.

I suppose I should've read this sooner and got my head into the world of JVM.

https://www.elastic.co/guide/en/elasticsearch/guide/current/_monitoring_individual_nodes.html

Especially this part... :rolling_eyes:

The heap_used_percent metric is a useful number to keep an eye on. Elasticsearch is configured to initiate GCs when the heap reaches 75% full. If your node is consistently >= 75%, your node is experiencing memory pressure. This is a warning sign that slow GCs may be in your near future.

If the heap usage is consistently >=85%, you are in trouble. Heaps over 90–95% are at risk of horrible performance with long 10–30s GCs at best, and out-of-memory (OOM) exceptions at worst.
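Those bands are easy to encode in a monitoring check; a tiny sketch (the function name is mine):

```python
def heap_pressure(heap_used_percent):
    """Classify heap_used_percent into the bands described above."""
    if heap_used_percent >= 90:
        return "critical"   # long GC pauses, OOM risk
    if heap_used_percent >= 85:
        return "trouble"
    if heap_used_percent >= 75:
        return "pressure"   # slow GCs likely in the near future
    return "ok"
```

Against the /_cat/nodes output earlier in the thread, all three data nodes (91%, 91%, 99%) land in the critical band.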

Hopefully this thread proves helpful to other new users of ElasticSearch - It's a fantastic product and for the most part very intuitive.

Glad you solved it!