The heap size on all our nodes is set to 30GB, but having read *A Heap of Trouble* and looked at the graphs in the Kibana Stack Monitoring application, I believe our heap size is larger than it needs to be:
It appears from that graph that the node in question bottoms out around 5GB of heap used, and that, indeed, GC cycles are quite far apart. How aggressively I would pursue reducing the allocated heap depends on how current usage compares to planned capacity, in terms of volume of retained data, indexing load, search load, and number of indices.
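To put a number on that outside the monitoring graphs, the node stats API exposes the same heap and GC counters. A minimal sketch, assuming an unsecured node on localhost:9200 (the URL and auth are placeholders for your own deployment):

```python
# Rough sketch: pull per-node heap and GC figures straight from the cluster,
# as an alternative to eyeballing the Stack Monitoring graphs.
import requests

ES_URL = "http://localhost:9200"   # assumption: local, unsecured node
AUTH = None                        # e.g. ("elastic", "password") if secured

resp = requests.get(f"{ES_URL}/_nodes/stats/jvm", auth=AUTH, timeout=10)
resp.raise_for_status()

for node_id, node in resp.json()["nodes"].items():
    mem = node["jvm"]["mem"]
    gc = node["jvm"]["gc"]["collectors"]
    print(
        f"{node['name']}: "
        f"heap {mem['heap_used_in_bytes'] / 2**30:.1f} / "
        f"{mem['heap_max_in_bytes'] / 2**30:.1f} GiB "
        f"({mem['heap_used_percent']}%), "
        f"young GCs {gc['young']['collection_count']}, "
        f"old GCs {gc['old']['collection_count']}"
    )
```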
If you believe you are at roughly the expected usage in those terms for a reasonable time horizon (perhaps 3 to 6 months), then you could easily cut that node's heap to, say, 15GB, and keep monitoring to see whether you can cut back further. You could conceivably go as low as 10GB.
If, on the other hand, this is a relatively new cluster and usage is expected to grow substantially in the near future, I would consider waiting until usage is more mature before adjusting, unless you are experiencing latency spikes during GC, in which case it should of course be a higher priority.
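If you do decide to shrink the heap on self-managed nodes, the usual approach is to set matching -Xms/-Xmx values and restart one node at a time, for example via a file under config/jvm.options.d/ (available since 7.7; on older versions, edit config/jvm.options directly). The file name and the 15g figure below are just illustrative:

```
# config/jvm.options.d/heap.options  (hypothetical file name, illustrative size)
-Xms15g
-Xmx15g
```

Rolling the change through one node at a time lets you watch the GC pattern at the lower setting before committing the whole cluster.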
Thanks for confirming my suspicion that the heap size could probably do with being reduced. However, since we are currently onboarding a number of different data streams, I'll revisit this once those are up and running.
This is the part that I don't understand. I allocated a 30GB heap to my nodes as well. I don't see any old GC, so I take that to mean my heap usage is pretty healthy; young GC happens about once every 3 seconds.
But your system's heap usage requires old GC, which suggests it is under more pressure than mine, and yet the recommendation is to reduce the heap allocation?
We are using G1GC, which is a significant improvement over the old CMS collector.
Does the GC algorithm play a bigger role than the cluster's actual heap usage?
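For what it's worth, one way to confirm which collector a node is actually running, and to measure the young GC cadence rather than estimating it, is the nodes info and node stats APIs. A rough sketch, assuming an unsecured node on localhost:9200 (placeholder endpoint):

```python
# Rough sketch: report the configured collectors per node, then sample young
# GC counts over a minute to get an approximate collection rate.
import time
import requests

ES_URL = "http://localhost:9200"   # assumption: local, unsecured node

info = requests.get(f"{ES_URL}/_nodes/jvm", timeout=10).json()["nodes"]
for node_id, node in info.items():
    print(f"{node['name']}: collectors = {node['jvm']['gc_collectors']}")

def young_counts():
    nodes = requests.get(f"{ES_URL}/_nodes/stats/jvm", timeout=10).json()["nodes"]
    return {n["name"]: n["jvm"]["gc"]["collectors"]["young"]["collection_count"]
            for n in nodes.values()}

before = young_counts()
time.sleep(60)                     # sample over one minute
after = young_counts()
for name in before:
    print(f"{name}: ~{after[name] - before[name]} young GCs/minute")
```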