Question about the relation between the number of indices and search latency

Hyunsoo_Shim · December 6, 2019, 2:58am

[System ENV]

instances : 6
: each instance -> 16 CPUs, 32GB RAM, 1TB SSD
nodes : 6
indices
: total number of indices is 75
-> each index has only 1 shard and about 600,000 docs (2~3GB)
replica count for each index is 5 (i.e., every node holds a primary or replica shard of each index)

We ran a search traffic test with the scenarios below, without any indexing jobs, on the above environment.
(search request type is dfs_query_then_fetch, with the query cache and request cache disabled)

Case 1. Send search requests to only 25 indices -> search latency : 40ms
Case 2. Send search requests to all 75 indices -> search latency : 190ms

The amount of search traffic is the same; only the number of indices targeted by the search requests differs.
Yet search latency differs greatly between Case 1 and Case 2.
We found that disk IOPS increased much more in Case 2, and we guess this is what adds to search latency.
Could you explain why disk IOPS increased compared with Case 1?

Christian_Dahlqvist · December 6, 2019, 5:31am

For each index queried, data structures on disk need to be accessed and the matching documents retrieved. Unless all files are cached in the operating system page cache, which is not the case in your scenario, querying more indices is likely to lead to increased disk I/O, as more files need to be read than fit into the page cache.
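
To put rough numbers on this, here is a minimal back-of-the-envelope sketch using the figures from the original post. The heap size is not stated in the thread, so the 16 GB below is purely an assumption, and the helper names are hypothetical. With 1 primary plus 5 replicas on 6 nodes, each node holds one copy of every index.

```python
# Rough per-node estimate of on-disk shard data versus memory left for the
# operating system page cache. The heap size is NOT given in the thread; the
# 16 GB used below is a hypothetical value for illustration only.

def shard_data_per_node_gb(num_indices, shard_size_gb, copies_per_node=1):
    """On-disk data per node when each node holds `copies_per_node` copies of every index."""
    return num_indices * shard_size_gb * copies_per_node


def page_cache_budget_gb(ram_gb, heap_gb):
    """Upper bound on RAM left for the page cache (ignores other processes)."""
    return ram_gb - heap_gb


RAM_GB = 32            # from the thread
ASSUMED_HEAP_GB = 16   # assumption, not stated in the thread
cache_gb = page_cache_budget_gb(RAM_GB, ASSUMED_HEAP_GB)

for label, num_indices in (("Case 1", 25), ("Case 2", 75)):
    low = shard_data_per_node_gb(num_indices, 2)   # 2 GB per shard
    high = shard_data_per_node_gb(num_indices, 3)  # 3 GB per shard
    print(f"{label}: {low}-{high} GB queried per node, "
          f"at most {cache_gb} GB of page cache")
```

Queries only touch parts of each file, so the real working set is smaller than these totals, but tripling the number of queried indices roughly triples the data competing for the same cache.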

Given that your shard size is small, it may make sense to try reducing the number of indices and then experiment with the number of replicas held. If you have fewer replicas, the total data volume held shrinks, which means more of the data can be cached. If you are not indexing new data, you can also try force merging the indices down to a single segment to improve search performance further.
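
As a sketch of what those two steps look like as REST calls, the snippet below only builds the requests (method, path, body) for the update index settings and force merge APIs and prints them; it does not contact a cluster. The index name "products-01" and the replica count of 2 are hypothetical values, not recommendations from this thread.

```python
import json


def replica_settings_request(index, replicas):
    """Build the update-settings call that changes the replica count of `index`."""
    body = {"index": {"number_of_replicas": replicas}}
    return ("PUT", f"/{index}/_settings", json.dumps(body))


def force_merge_request(index, max_num_segments=1):
    """Build the force merge call; only sensible for indices that no longer receive writes."""
    return ("POST", f"/{index}/_forcemerge?max_num_segments={max_num_segments}", None)


# "products-01" is a hypothetical index name; 2 replicas is an arbitrary
# value to experiment with, not a figure from the thread.
for method, path, body in (
    replica_settings_request("products-01", 2),
    force_merge_request("products-01"),
):
    print(method, path, body or "")
```

Force merging can be I/O intensive while it runs, so it is usually done outside of a latency test.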

It may also be worthwhile to monitor your heap usage and, if you have room to spare, try reducing the heap size. This will leave more room for the operating system page cache, which can reduce disk I/O.
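
One way to check heap usage is the node stats API (`GET _nodes/stats/jvm`), which reports `heap_used_percent` and `heap_max_in_bytes` per node. The sketch below parses an already-fetched response; the sample data is made up for illustration and the helper name is hypothetical.

```python
def heap_usage_by_node(nodes_stats):
    """Map node name -> (heap_used_percent, heap_max in GiB) from a
    parsed `GET _nodes/stats/jvm` response."""
    usage = {}
    for node in nodes_stats["nodes"].values():
        mem = node["jvm"]["mem"]
        usage[node["name"]] = (
            mem["heap_used_percent"],
            mem["heap_max_in_bytes"] / 2**30,
        )
    return usage


# Made-up sample shaped like the real response, for illustration only.
sample = {
    "nodes": {
        "node-id-1": {
            "name": "es-node-1",
            "jvm": {"mem": {"heap_used_percent": 35,
                            "heap_max_in_bytes": 16 * 2**30}},
        },
        "node-id-2": {
            "name": "es-node-2",
            "jvm": {"mem": {"heap_used_percent": 41,
                            "heap_max_in_bytes": 16 * 2**30}},
        },
    }
}

for name, (used_pct, max_gib) in sorted(heap_usage_by_node(sample).items()):
    print(f"{name}: {used_pct}% of {max_gib:.0f} GiB heap in use")
```

If usage stays consistently low under load, the heap (set with `-Xms` and `-Xmx` in `jvm.options`) may be a candidate for reduction.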

Hyunsoo_Shim · December 6, 2019, 8:02am

Thank you for your answer!!

As I understand it, every search request creates a search context and uses an IndexReader to open and read segment files during the query and fetch phases.
Is the data from segment files also stored in the page cache?

Your suggestion (reducing the heap size) seems very good.
We will try reducing the current heap size or upgrading the system (e.g., 32GB -> 64GB RAM).

Hyunsoo_Shim · December 10, 2019, 11:46am

We tested with a reduced heap size, and search latency decreased as well.

system · January 7, 2020, 11:46am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
