Slower query performance after upgrade to 6.2.0 ES

(Tosh) #1

Hi,

We recently upgraded from Elasticsearch 2.3 to 6.2. Search queries now seem to take longer to respond than before for the same number of requests.
The configuration difference we noticed is that in 2.3 tcp.compress was true by default, with the LZF algorithm used for compression, whereas in 6.2 it is false by default and DEFLATE compression is used.

We disabled tcp.compress but still don't see any improvement in performance. At present we have 6 nodes in the cluster with 39.36 GB of disk space per node.

Is there any other configuration we should look at to improve performance? There has been no hardware change as such, and ES 2.3 had better response times.

(Daniel Mitterdorfer) #2


That's really hard to tell, so I can only provide some pointers that should help you address this.

I suggest that you start by setting up targeted experiments (i.e. benchmarks). We use Rally for that, but you can also choose other tools you're familiar with. Next, I'd analyze system behavior, e.g. by attaching a profiler while running a benchmark or by sampling Elasticsearch APIs such as the node stats API, which provides more insight into what Elasticsearch is doing. You can also analyze behavior at the system level, e.g. with the USE method, to identify your bottleneck.
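
One way to sample the node stats API around a benchmark run is to take two snapshots of the search thread pool and diff the counters. The following is only a minimal sketch: the `http://localhost:9200` address and the helper names `fetch_thread_pool_stats` and `search_pool_delta` are assumptions for illustration, while the `_nodes/stats/thread_pool` endpoint and the `completed`, `rejected`, `queue` and `active` fields are what the node stats API reports per thread pool.

```python
import json
import urllib.request


def fetch_thread_pool_stats(base_url="http://localhost:9200"):
    """Fetch thread pool stats for all nodes (base_url is an assumed address)."""
    with urllib.request.urlopen(base_url + "/_nodes/stats/thread_pool") as resp:
        return json.load(resp)


def search_pool_delta(before, after):
    """Diff the search thread pool counters between two node stats snapshots."""
    deltas = {}
    for node_id, node in after["nodes"].items():
        new = node["thread_pool"]["search"]
        # Nodes that joined between snapshots have no baseline; count from zero.
        old = before["nodes"].get(node_id, {}).get("thread_pool", {}).get("search", {})
        deltas[node.get("name", node_id)] = {
            "completed": new["completed"] - old.get("completed", 0),
            "rejected": new["rejected"] - old.get("rejected", 0),
            "queue": new["queue"],    # point-in-time gauge, not a counter
            "active": new["active"],  # point-in-time gauge, not a counter
        }
    return deltas
```

Calling `fetch_thread_pool_stats()` before and after a run and passing both results to `search_pool_delta` shows, per node, how many search tasks completed or were rejected during the run, alongside the queue depth and active thread count at the time of the second snapshot.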

As benchmarking can be quite tricky, we have written a blog post with Seven Tips for Better Elasticsearch benchmarks.


(Akshat) #3

We did some further analysis of the issue mentioned by @Tosh above and are providing more details below.

What we are seeing is some sort of bottleneck when the number of transport MultiSearch requests from our Java client increases. As the MultiSearch request count increases, response times start to degrade.
For example, a single multi search API request contains 12 queries on average. On its own, such a request takes around 1 second to complete, but as we increase the number of concurrent MultiSearch requests, response times jump to 3 seconds, 5 seconds, and we have even seen them go up to 20 seconds.
Each of these queries needs to hit around 290 shards (across multiple indices) in the cluster.

We also see that the search thread pool is not fully utilized and does not reach its maximum size, which is 25 in our case; we are running a cluster of 6 data nodes on ES 6.2.2. Each data node has 16 GB RAM and 16 CPUs.
Checking load and CPU on the servers shows that they are not under heavy load at any time of day, which is consistent with the thread pool being underutilized.
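
As a rough back-of-the-envelope sketch of the fan-out described above, the snippet below uses only the numbers from this post (12 queries per multi search request, about 290 shards per query, 6 data nodes with a search thread pool of 25 each). The function name is hypothetical, and the model assumes every query touches all 290 shards and ignores any throttling or queueing, so it is an upper bound on demand rather than a measurement.

```python
def shard_level_requests(concurrent_msearch, queries_per_msearch=12, shards_per_query=290):
    """Upper bound on shard-level search requests triggered by in-flight msearch calls."""
    return concurrent_msearch * queries_per_msearch * shards_per_query


# Cluster-wide search threads: 6 data nodes x 25 threads (numbers from the post).
SEARCH_THREADS = 6 * 25

for concurrency in (1, 5, 10):
    total = shard_level_requests(concurrency)
    # Ratio of shard-level work items to the threads available to run them.
    print(concurrency, total, round(total / SEARCH_THREADS, 1))
```

Under these assumptions a single multi search request already fans out to 3,480 shard-level requests against 150 search threads, which is why the limits on how many of those are allowed in flight at once matter more than raw CPU capacity.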

Are there any settings or properties introduced after ES 2.3 that we should set to make better use of the cluster and avoid this bottleneck?
We saw a couple of parameters on the search and multi search APIs, max_concurrent_shard_requests and max_concurrent_searches, that were introduced after 2.3.
The default value for max_concurrent_shard_requests in our case is 30, as we have 6 nodes and 5 shards per index. We also tried increasing it to 200 but did not see much improvement in the bottleneck. We used JMeter to generate load via the Java client.
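
For reference, this is roughly how `max_concurrent_searches` can be set on a multi search request over REST. It is a minimal sketch that only builds the request path and the newline-delimited JSON body; the index names, the `match_all` queries and the helper names are placeholders, and it targets the REST `_msearch` endpoint rather than the Java transport client used in these tests.

```python
import json


def build_msearch_body(searches):
    """Serialize (index, query_body) pairs into the NDJSON body _msearch expects."""
    lines = []
    for index, body in searches:
        lines.append(json.dumps({"index": index}))  # header line
        lines.append(json.dumps(body))              # search body line
    # The body must be terminated by a newline.
    return "\n".join(lines) + "\n"


def msearch_path(max_concurrent_searches):
    """Build the request path with the concurrency limit as a URL parameter."""
    return "/_msearch?max_concurrent_searches={}".format(max_concurrent_searches)


# Placeholder indices and queries, for illustration only.
searches = [
    ("index-a", {"query": {"match_all": {}}, "size": 10}),
    ("index-b", {"query": {"match_all": {}}, "size": 10}),
]
path = msearch_path(4)
body = build_msearch_body(searches)
```

The resulting `body` would be sent as a POST to `path` with the `Content-Type: application/x-ndjson` header; varying the value passed to `msearch_path` under the same JMeter load is one way to check whether this limit is what caps concurrency.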

Any suggestions would be helpful.
