Hi,
This blog post points out the value of the G1 garbage collector when heap sizes get large, but recommends against using it because, at the time, there were issues with Lucene index corruption...
Can I ask what the current state of play is regarding the readiness of G1 for garbage collection on Elasticsearch nodes?
The most recent index corruption posts seem to be from 2015...
Because of erratic responses from ES5, I run one node with G1GC. I've tried to put the same load on it as on the other cluster members.
BTW, these are the stats for our problematic document GET requests on ES 5.1.2 with CMS vs. G1GC vs. ES2 (CMS), running on OpenJDK 8:
Setup      N       Min       Max        Median    Avg        Stddev
ES5 G1GC   6775    0.501383  7576.2033  1.466888  224.6603   776.50062
ES5 CMS    132832  0.367832  11799.925  1.481297  212.55076  844.80018
ES2 CMS    85480   0.377329  1578.3277  1.316108  3.1289925  17.710852
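As can be seen, ES5 has an average response time nearly two orders of magnitude(!) slower than ES2, but the median points to the cause: the medians are almost identical, so the typical request is just as fast, while ES5 behaves much more unpredictably and has far more slow responses that drag the average up.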
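To illustrate why the average and the median can disagree this much, here is a minimal sketch using made-up numbers (these are synthetic values, not our measurements, and the `summarize` and `slow_fraction` helpers are names I invented for the example). Replacing a single fast response with one very slow response leaves the median untouched while the average jumps by about two orders of magnitude:

```python
import statistics


def summarize(latencies):
    """Return the same columns as the table above for a list of latencies."""
    return {
        "n": len(latencies),
        "min": min(latencies),
        "max": max(latencies),
        "median": statistics.median(latencies),
        "avg": statistics.mean(latencies),
        "stddev": statistics.stdev(latencies),  # sample standard deviation
    }


def slow_fraction(latencies, threshold):
    """Fraction of responses slower than `threshold` (same unit as the data)."""
    return sum(1 for value in latencies if value > threshold) / len(latencies)


# Synthetic data for illustration only: nine fast responses...
fast = [1.0, 1.2, 1.4, 1.5, 1.6, 1.8, 2.0, 2.2, 2.5]
# ...and the same set with the slowest one replaced by a single huge outlier.
with_tail = fast[:-1] + [2000.0]

fast_stats = summarize(fast)
tail_stats = summarize(with_tail)

print("median:", fast_stats["median"], "vs", tail_stats["median"])
print("avg:   ", fast_stats["avg"], "vs", tail_stats["avg"])
print("slow fraction (> 100):", slow_fraction(with_tail, 100.0))
```

Running something like `slow_fraction` over the real samples with a few thresholds would show how heavy the slow tail is for each setup, which the average and median alone cannot tell us.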
CMS vs. G1GC with ES5 shows mixed results: the max value is lower with G1GC (but that's a much smaller data set), while the average and median are nearly the same.
The interesting part would be comparing the CPU usage of CMS vs. G1GC, but I haven't done that yet. And of course, if there is a possibility of index corruption with G1GC, the whole point of switching to it is moot.
I'd also like to know what the official statement on this is.