Questions about query latency

yeziblo · July 13, 2021, 4:09am

Hi everyone.

I have a strange problem.

We have an ES cluster with three nodes. Each node has 8 GB of heap memory, and the ES version is 7.8.

Now I have two indexes: test@20210711125233 and test@20210712075544. Their mappings are exactly the same, and their document counts are similar (both are around 26 GB in primary size). test@20210711125233 has 6 primary shards and test@20210712075544 has only 2 primary shards.

test@20210711125233 has 168 segments and test@20210712075544 has 53 segments.

When I query these two indexes with the same DSL, test@20210712075544 takes almost twice as long as test@20210711125233.

This surprised me, because test@20210712075544 has fewer segments and fewer shards. I know that fewer shards is not always better, but considering the size of my index, 2 shards should be more suitable than 6. Why does its query take longer?

Here is some information from Kibana:

test@20210712075544:

test@20210711125233:

I noticed that the latency of test@20210712075544 is much higher than that of test@20210711125233, but I am not sure whether this is the cause of the problem.

In addition, what factors determine index latency, and how can it be reduced effectively?

Best regards.

Christian_Dahlqvist · July 13, 2021, 4:22am

When you query an index, all shards are queried in parallel, but each shard is processed using a single thread for that query. Your index with 6 shards can therefore use 6 threads, while the one with 2 will only use 2. If there are no other queries running at the same time, you may be able to have all 6 threads run in parallel, which could account for the difference in speed.

Just because the 6 shards are faster when you only have a single concurrent query does not, however, mean it will continue to be so once the number of concurrent queries increases. At some point requests may start to queue up and cause performance to drop. It is therefore important to test using the expected data volume and query load when you are optimizing shard size and count.
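
The reasoning above can be sketched as a toy model. This is only an illustration under stated assumptions, not a description of how Elasticsearch schedules work internally: the function name `estimate_latency_ms` and all of its parameters are hypothetical, the work is assumed to split evenly across shards, each shard task is assumed to occupy one search thread, and tasks are assumed to run in waves over a fixed number of threads.

```python
import math


def estimate_latency_ms(total_work_ms, shards, threads,
                        concurrent_queries=1, per_shard_overhead_ms=0.0):
    """Toy estimate of the time until all concurrent queries have finished.

    Assumptions (not taken from Elasticsearch itself):
    - total_work_ms is the single-threaded cost of scanning the whole index
    - that cost splits evenly across shards
    - each shard task occupies exactly one search thread
    - tasks run in waves over a fixed number of threads
    """
    # Cost of one shard task for one query.
    task_ms = total_work_ms / shards + per_shard_overhead_ms
    # Every concurrent query creates one task per shard.
    tasks = shards * concurrent_queries
    # Number of rounds needed to push all tasks through the available threads.
    waves = math.ceil(tasks / threads)
    return waves * task_ms


if __name__ == "__main__":
    for queries in (1, 12):
        for shards in (6, 2):
            ms = estimate_latency_ms(600, shards, threads=6,
                                     concurrent_queries=queries)
            print(f"queries={queries:2d} shards={shards} -> {ms:.0f} ms")
```

With 600 ms of total work and 6 threads, a single query finishes in 100 ms on 6 shards and 300 ms on 2 shards in this model. With 12 concurrent queries both layouts take 1200 ms, because the threads are saturated either way. A non-zero `per_shard_overhead_ms` would then make the 6-shard layout the slower one, which is why testing under the expected query load matters.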

yeziblo · July 13, 2021, 5:11am

Thanks Christian, your answer is very helpful. I did only test with a single query.

Christian_Dahlqvist · July 13, 2021, 5:54am

Are you expecting to have more concurrent queries in the cluster?

yeziblo · July 13, 2021, 6:13am

To be honest, this problem occurred in our test environment, where there are no other concurrent queries, so as you said, the index with 6 shards is faster.

In the production environment, however, concurrent queries do exist. Using the cat thread_pool API, I found that search requests are not being queued, so I am considering changing the production index from 6 shards to 2 shards to observe the query speed.

Our Elasticsearch cluster in the production environment has 7 nodes, and the result of the thread_pool API looks like this:

Considering that the search requests are not queued, maybe 6 shards are indeed better in my environment?
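
As a rough sketch of the queue check described above, the text returned by `GET _cat/thread_pool/search?v&h=node_name,name,active,queue,rejected` could be parsed like this. The helper names and the sample output are hypothetical (the node names and numbers are made up for illustration, not taken from this cluster), and the parser assumes whitespace-separated columns with no spaces inside node names.

```python
def parse_cat_thread_pool(text):
    """Parse the column-aligned text of a _cat/thread_pool call made with ?v.

    Assumes the first non-empty line is the header and that no value
    (for example a node name) contains whitespace.
    """
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        row = dict(zip(header, ln.split()))
        # Convert the numeric columns so they can be compared.
        for key in ("active", "queue", "rejected"):
            if key in row:
                row[key] = int(row[key])
        rows.append(row)
    return rows


def nodes_under_pressure(rows):
    """Return names of nodes whose search pool shows queued or rejected tasks."""
    return [r["node_name"] for r in rows
            if r.get("queue", 0) > 0 or r.get("rejected", 0) > 0]


# Hypothetical sample output, for illustration only.
SAMPLE = """
node_name name   active queue rejected
node-1    search 4      0     0
node-2    search 13     27    3
node-3    search 0      0     0
"""

if __name__ == "__main__":
    print(nodes_under_pressure(parse_cat_thread_pool(SAMPLE)))
```

Note that `queue` is a point-in-time value while `rejected` accumulates since the node started, so a single snapshot with an empty queue does not rule out queuing at peak load. Sampling repeatedly during busy periods gives a more reliable picture.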

system · August 10, 2021, 6:14am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
