Elasticsearch 7.7.0 shows a regression when benchmarked against 7.2.0 for the following challenge in es-rally:
Es-Rally Version: 2.0
Track: http_logs
Challenge: append-sorted-no-conflicts
Total Regressions: 18
ES Versions: 7.7.0 vs 7.2.0
Total ES nodes: 2 (Data, Master)
ES Instance Type: 16 cores, 64GB
The regression was consistently observed for this specific challenge in the http_logs track.
Version 7.7.0 shows improvements in the other challenges, such as append-no-conflicts and append-no-conflicts-index-only.
I had a look at the results of the nightly regression tests and did not see any regression. However, I do not know exactly what is behind your numbers. Can you provide some additional information about the environment where you saw this regression?
Are you using local SSDs? Indexing is I/O intensive so it would be good to get a full picture of how your environment compares to the one running the nightly benchmarks.
Is this the only benchmark where you see a regression? Are all other benchmarks consistent with previous versions in your environment?
Yes, all races show performance improvements with 7.7.0 except the http_logs track with the append-sorted-no-conflicts challenge.
Are you using local SSDs? Indexing is I/O intensive so it would be good to get a full picture of how your environment compares to the one running the nightly benchmarks
We are not using local SSDs. Performance is good enough for our requirements.
Can you monitor disk utilization and iowait for this benchmark and compare the two versions? I wonder if this benchmark might have resulted in increased I/O, which has a more notable effect in your cluster than the one that runs nightly benchmarks as it is using fast, local SSDs.
@dliappis Are you monitoring and comparing resource usage, e.g. disk IOPS, for the nightly benchmarks?
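One way to make the iowait comparison suggested above concrete is to sample /proc/stat on each data node at the start and end of a race and compute the share of CPU time spent waiting on I/O. The following is a minimal sketch, assuming Linux hosts; the function names are hypothetical and not part of Rally or Elasticsearch.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two /proc/stat samples.

    Only the first eight counters (user, nice, system, idle, iowait, irq,
    softirq, steal) are summed, because guest time is already part of user.
    """
    b = parse_cpu_line(before)[:8]
    a = parse_cpu_line(after)[:8]
    total = sum(a) - sum(b)
    if total <= 0:
        return 0.0
    # iowait is the fifth counter on the line (index 4).
    return 100.0 * (a[4] - b[4]) / total


def sample_iowait(interval_seconds=5.0, path="/proc/stat"):
    """Read /proc/stat twice, interval_seconds apart (Linux only)."""
    with open(path) as handle:
        before = handle.read()
    time.sleep(interval_seconds)
    with open(path) as handle:
        after = handle.read()
    return iowait_percent(before, after)
```

Running sample_iowait() in a loop during the append-sorted-no-conflicts race on 7.2.0 and again on 7.7.0 would give two series to compare. Tools such as iostat report the same counters along with per-device utilization.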
Reading the thread, one major change in Elasticsearch 7.7.0 is the move to Java 14, which forced the use of G1GC (Java 14 removed the CMS garbage collector). Are you using the bundled JDK or explicitly setting your own JAVA_HOME? I suspect the former. In any case, looking at the release benchmarks for 7.8.0 in Elasticsearch Benchmarks (see sorted-4g median indexing throughput), I don't see a significant regression, even though the heap there is fairly small at 4g.
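To answer the JDK question on both clusters, the nodes info API (GET _nodes/jvm) reports the Java version, the active garbage collectors, and whether the bundled JDK is in use. The sketch below summarizes such a response; the URL, the helper names, and the field selection are assumptions for illustration, so check them against the actual response from your version.

```python
import json
import urllib.request


def summarize_jvm(nodes_info):
    """Extract per-node JVM details from a parsed GET _nodes/jvm response."""
    rows = []
    for node in nodes_info.get("nodes", {}).values():
        jvm = node.get("jvm", {})
        rows.append({
            "name": node.get("name"),
            "java_version": jvm.get("version"),
            "using_bundled_jdk": jvm.get("using_bundled_jdk"),
            "gc_collectors": jvm.get("gc_collectors", []),
        })
    return rows


def fetch_jvm_info(base_url="http://localhost:9200"):
    """Fetch nodes info; base_url is an assumed, unsecured local endpoint."""
    with urllib.request.urlopen(base_url + "/_nodes/jvm") as response:
        return json.load(response)
```

Calling summarize_jvm(fetch_jvm_info()) against each cluster and comparing the gc_collectors lists would show whether the 7.2.0 run used CMS while the 7.7.0 run used G1.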