Elasticsearch version: 1.7.3 - 2.3.3
JVM version: "1.8.0_72"
OS version: Ubuntu 14.04
Description of the problem including expected versus actual behavior:
I'm running ES on a 5-node cluster where each node has 15GB of RAM. The max heap size is set to 7GB on each server. Installed plugins: searchguard-ssl, stats, and head.
This is a screenshot of the heap usage on each node over a 10-day period. You can see the normal saw-tooth pattern, but the bottom of each cycle is rising. It seems that the garbage collector frees less and less memory every time it runs. I was having a similar issue on ES 1.7.3 and hoped that bumping to 2.3.3 would help alleviate it, but no such luck.
Once the servers reach ~6GB of heap used by ES, we do a rolling restart of the cluster, which starts the whole cycle over.
I recently posted this issue on GitHub (https://github.com/elastic/elasticsearch/issues/19544) and was directed to the page on limiting memory usage (https://www.elastic.co/guide/en/elasticsearch/guide/current/_limiting_memory_usage.html). The documentation leads me to believe that field data is the likely culprit of our endlessly growing heap; however, the total field data reported does not seem large enough to explain it.
When I run GET /_stats/fielddata?fields=* on my cluster I get:
"total":{
  "fielddata":{
    "memory_size_in_bytes":4739127224,
    "evictions":0,
    "fields":{
      "@tag":{
        "memory_size_in_bytes":42626416
      },
      "user_id":{
        "memory_size_in_bytes":14236968
      },
      "time":{
        "memory_size_in_bytes":8049808
      },
      "buildNum":{
        "memory_size_in_bytes":816
      },
      "created":{
        "memory_size_in_bytes":1490114912
      },
      "@type":{
        "memory_size_in_bytes":21128028
      },
      "@timestamp":{
        "memory_size_in_bytes":3162970276
      }
    }
  }
}
Based on these results it would seem that the total amount of memory used for field data is ~4.7GB. Based on the per-node results, I assume that this total is spread across all nodes in the cluster. But 4.7GB out of a total heap of 35GB (7GB per node across 5 nodes) doesn't seem like it would be the sole cause of such steep growth in our heap.
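To double-check the arithmetic above, here is a small Python sketch that sums the per-field values from the output and compares the result with the combined heap. It assumes that "7GB" means -Xmx7g, i.e. 7 GiB per node; the variable and function names are only illustrative.

```python
# Per-field fielddata sizes in bytes, copied from the _stats/fielddata output above.
FIELDS = {
    "@tag": 42626416,
    "user_id": 14236968,
    "time": 8049808,
    "buildNum": 816,
    "created": 1490114912,
    "@type": 21128028,
    "@timestamp": 3162970276,
}

# The "memory_size_in_bytes" value reported at the top of the "fielddata" object.
REPORTED_TOTAL = 4739127224

# Assumption: the 7GB max heap is -Xmx7g (7 GiB) on each of the 5 nodes.
HEAP_PER_NODE = 7 * 1024**3
NODES = 5


def fielddata_share(fields, heap_per_node, nodes):
    """Return (total fielddata bytes, fraction of the combined cluster heap)."""
    total = sum(fields.values())
    return total, total / (heap_per_node * nodes)


total, share = fielddata_share(FIELDS, HEAP_PER_NODE, NODES)
print(f"sum of per-field sizes: {total} bytes ({total / 1024**3:.2f} GiB)")
print(f"matches reported total: {total == REPORTED_TOTAL}")
print(f"share of combined heap: {share:.1%}")
```

The per-field sizes add up exactly to the reported total, and that total comes to roughly 13% of the combined heap, so the listed fields account for all reported field data but only a small part of the heap.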
Is there something else that I may be overlooking with regard to the effect of fielddata on heap usage?