Reducing heap size with ES 2.1

What size heap are you using?

We are using Elasticsearch 2.1 after recently upgrading from 1.7. We are running 128GB heaps because we used to get Java out-of-memory errors. Now that we're on 2.1, we are trying to resolve a timeout problem we've been experiencing, and it was recommended that we go from a 128GB heap to a 30GB heap.
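
As a rough illustration (not from the original discussion), here is one way to see what is actually consuming the heap per node before and after such a change, assuming the Python `requests` library and a node reachable at http://localhost:9200 without authentication:

```python
# Sketch: report per-node JVM heap usage and field data cache size, to see
# what is actually consuming the heap. Assumes a 2.x node reachable at
# http://localhost:9200 with no auth, and the `requests` library.
import requests

ES_URL = "http://localhost:9200"
GB = 1024 ** 3

# Node stats, limited to the JVM and indices-level metrics.
stats = requests.get(ES_URL + "/_nodes/stats/jvm,indices").json()

for node_id, node in stats["nodes"].items():
    heap_used = node["jvm"]["mem"]["heap_used_in_bytes"]
    heap_max = node["jvm"]["mem"]["heap_max_in_bytes"]
    fielddata = node["indices"]["fielddata"]["memory_size_in_bytes"]
    print(f"{node['name']}: heap {heap_used / GB:.1f}/{heap_max / GB:.1f} GB, "
          f"field data cache {fielddata / GB:.1f} GB")
```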

OK, I'm willing to give that a go.

Aren't we about to swap one problem for another? What are the implications of going to a smaller heap, given that we had so many out-of-memory issues before? Are there Java OOM issues that have been resolved since ES 1.7?

A very important change between 1.x and 2.x that lets Elasticsearch use far less heap is the use of doc values for sorting and aggregations, as opposed to the 1.7 default of the field data cache. The field data cache used to be a huge memory consumer and a major cause of OOM errors. Doc values are the default in 2.x (they were available in 1.x, but not the default) and save a huge amount of heap, because they are an on-disk data structure. The performance hit compared to the field data cache is negligible, and there is a real upside: the less memory you give to the Java heap, the more the operating system has for its file system cache, which greatly speeds up access to doc values and any other Lucene-level files Elasticsearch uses.
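
To make that concrete, here is a minimal sketch of how you could check which fields are still loading field data into the heap on 2.x; fields served by doc values will not show up in this output. It assumes the Python `requests` library and a node at http://localhost:9200 without authentication:

```python
# Sketch: list fields that still load field data into the heap.
# Fields sorted/aggregated via doc values will not appear here.
# Assumes `requests` and a 2.x node at http://localhost:9200 with no auth.
import requests

ES_URL = "http://localhost:9200"
MB = 1024 ** 2

# Index stats with a per-field field data breakdown.
stats = requests.get(ES_URL + "/_stats/fielddata", params={"fields": "*"}).json()

fd = stats["_all"]["total"]["fielddata"]
print(f"total field data in heap: {fd['memory_size_in_bytes'] / MB:.1f} MB")
for field, usage in fd.get("fields", {}).items():
    print(f"  {field}: {usage['memory_size_in_bytes'] / MB:.1f} MB")
```

Analyzed string fields are the usual entries here, since they still need field data for sorting and aggregations even on 2.x; not_analyzed strings, numerics and dates get doc values by default.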

See this link for more details: https://www.elastic.co/guide/en/elasticsearch/guide/1.x/doc-values.html

This is also a good blog post by one of our engineers on Java heap usage in Elasticsearch (and heap usage in general): https://www.elastic.co/blog/a-heap-of-trouble

You really need to run a number of smaller nodes, rather than one single massive one. That way each node can stay at or below the ~30GB heap (keeping compressed object pointers, as the blog post above explains), and the rest of each machine's RAM goes to the file system cache instead of the Java heap.
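
If you do split into smaller nodes, a quick sketch along the same lines (same assumptions: `requests`, any node reachable at http://localhost:9200 without authentication) to confirm that every node in the cluster comes up with the intended heap:

```python
# Sketch: confirm each node's configured maximum heap after reconfiguring.
# Assumes `requests` and any node in the cluster reachable at localhost:9200.
import requests

ES_URL = "http://localhost:9200"
GB = 1024 ** 3

# /_nodes/jvm returns static JVM info (including heap max) for every node.
info = requests.get(ES_URL + "/_nodes/jvm").json()

for node_id, node in info["nodes"].items():
    heap_max_gb = node["jvm"]["mem"]["heap_max_in_bytes"] / GB
    # Staying at or below roughly 30GB keeps compressed object pointers
    # (see the "heap of trouble" post linked above).
    flag = "OK" if heap_max_gb <= 31 else "over the compressed-oops range"
    print(f"{node['name']}: heap max {heap_max_gb:.1f} GB ({flag})")
```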