Find out what causes long GC cycles on client node


My client nodes have been experiencing very long GC cycles lately.

Is there any way to narrow down the cause of this?
We use Kibana most of the time and usually don't query ES any other way.

Our client nodes run with a 6GB heap.
Heap usage tends to fluctuate between 30% and 75%, with occasional spikes to 80%.

At some point, the heap usage rises until either an OOM or very long GCs kill the node.

Besides the GCs, the logs don't show any other relevant information.


These might be useful references:

For me, it was almost always fielddata. Look for evictions there and in the filter cache. `_nodes/stats`, Marvel, or another monitoring tool is your friend for troubleshooting this.
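A minimal sketch of what to look for in the stats output. The JSON below is a hypothetical excerpt of an ES 1.x `GET /_nodes/stats/indices` response (node name and numbers are made up); the point is that steadily rising eviction counts usually mean the cache is thrashing and churning the heap.

```python
import json

# Hypothetical excerpt of a /_nodes/stats/indices response (ES 1.x structure;
# in practice, fetch this from http://localhost:9200/_nodes/stats/indices).
sample = json.loads("""
{
  "nodes": {
    "abc123": {
      "name": "client-1",
      "indices": {
        "fielddata":    {"memory_size_in_bytes": 2147483648, "evictions": 5120},
        "filter_cache": {"memory_size_in_bytes": 536870912,  "evictions": 300}
      }
    }
  }
}
""")

# Report per-node cache sizes and evictions -- nonzero, growing eviction
# counts are the usual smoking gun for fielddata-driven GC pressure.
for node in sample["nodes"].values():
    fd = node["indices"]["fielddata"]
    fc = node["indices"]["filter_cache"]
    print(f'{node["name"]}: fielddata {fd["memory_size_in_bytes"] >> 20} MB '
          f'({fd["evictions"]} evictions), '
          f'filter cache {fc["memory_size_in_bytes"] >> 20} MB '
          f'({fc["evictions"]} evictions)')
```

Comparing two snapshots of these counters a few minutes apart tells you whether evictions are ongoing or historical.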

If you are not using doc_values, enabling them can help (at some indexing cost).
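For illustration, a sketch of an ES 1.x mapping that moves fielddata off-heap via doc_values (the index, type, and field names here are made up; doc_values must be set at index-creation time and, in 1.x, only work on not_analyzed string fields and numerics):

```python
import json

# Hypothetical mapping body for: PUT /my-index
# Enabling doc_values stores the column data on disk instead of building
# fielddata on the heap when the field is sorted or aggregated on.
mapping = {
    "mappings": {
        "logs": {
            "properties": {
                "status": {
                    "type": "string",
                    "index": "not_analyzed",  # doc_values need non-analyzed fields in 1.x
                    "doc_values": True        # keep this field's data off-heap
                }
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```

Existing indices have to be reindexed for this to take effect, which is where the indexing cost comes in.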

6GB of how much in total? You can easily raise the heap for dedicated client nodes to 75% of total system memory, since you don't need to worry about FS caching there.
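As a rough sizing sketch (the 64GB figure is just an assumed example, and the ~31GB ceiling is the usual cap to keep compressed object pointers):

```python
# Rough heap sizing for a *dedicated* client node: no data on disk to cache,
# so FS cache matters little. Cap at ~31 GB to stay under the compressed-oops
# threshold. Total RAM here is a hypothetical example.
total_gb = 64
heap_gb = min(int(total_gb * 0.75), 31)
print(f"ES_HEAP_SIZE={heap_gb}g")  # env var read by the ES 1.x startup scripts
```

For a node that also hosts a data node on the same box (as below), you obviously have to subtract the data node's heap and its FS-cache needs first.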

Otherwise, the only reason client nodes hit lots of GC is that you are pushing more or larger queries through them.

We have our client node running alongside one of our data nodes on the same machine.

Server: 64GB
Data: 30GB
Client: 6GB

We are still on 1.7, so heap matters more to us than filesystem cache.

I'm aware that this setup is far from perfect, but we have to work with a fixed budget.