We've been working on an upgrade from Elasticsearch cluster running v6.2.3 to v7.13.1 and running into a major performance roadblock.
We are running the versions side-by-side and trying to cut-over to the new version but when we push the load above 10% of the overall volume of our traffic the new Elasticsearch v7 cluster performance degrades drastically. The query time from the v7 cluster jumps to be 3x worse than the v6 performance.
If we drop the load down back to 10%, the query latency recovers and the v7 version runs at 3x better than the v6 version. This seems to indicate some kind of bottleneck we're facing with the way the cluster is handling the requests.
I've been reading through the documentation and looking for any settings that might cause this behavior.
I was hoping to post here and see if anyone had experienced something similar with suggestions of what to try to remediate the issue.