Question about the relationship between the number of indices and search latency

[System ENV]

  • instances : 6
    : each instance -> 16 CPUs, 32GB RAM, 1TB SSD
  • nodes : 6
  • indices
    : total number of indices is 75
    -> each index has only 1 shard and about 600,000 docs (2~3GB)
    -> each index has 5 replicas (i.e., every node holds a primary or replica shard of each index)

We ran a search traffic test with the scenarios below, with no indexing jobs running, on the above environment.
(The search type is dfs_query_then_fetch, and the query cache and request cache were disabled.)

Case 1. Send search requests to only 25 indices -> search latency : 40ms
Case 2. Send search requests to all 75 indices -> search latency : 190ms

The amount of search traffic is the same; only the number of indices targeted by the search requests differs.
Yet search latency differs greatly between Case 1 and Case 2.
We found that disk IOPS is much higher in Case 2, and we suspect this is what adds to the search latency.
Why does disk IOPS increase compared with Case 1?

For each index queried, data structures on disk need to be accessed and the matching documents retrieved. Unless all files are cached in the operating system page cache, which is not the case in your scenario, querying more indices is likely to lead to increased disk I/O, as more files need to be read than fit into the page cache.
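
To put rough numbers on this: with 5 replicas on 6 nodes, every node holds one copy of every index, so the data a node may have to read grows with the number of indices queried. The sketch below uses the figures from the post (2.5 GB as the midpoint of 2~3GB per index). The 16 GB heap is purely an assumption, since the actual heap size was not stated, and the function name is only illustrative.

```python
# Rough per-node estimate of data volume versus page cache.
# Index count, index size, replicas, nodes, and RAM come from the post;
# ASSUMED_HEAP_GB is a hypothetical value, not a measured one.

def per_node_data_gb(num_indices, index_size_gb, replicas, nodes):
    """Data held on each node, assuming shard copies are spread evenly."""
    copies = 1 + replicas  # one primary plus its replicas
    return num_indices * index_size_gb * copies / nodes

RAM_GB = 32
ASSUMED_HEAP_GB = 16  # hypothetical; substitute the real -Xmx value
page_cache_gb = RAM_GB - ASSUMED_HEAP_GB  # upper bound, the OS needs some too

case1_gb = per_node_data_gb(25, 2.5, replicas=5, nodes=6)
case2_gb = per_node_data_gb(75, 2.5, replicas=5, nodes=6)

print(f"page cache available (at most): {page_cache_gb} GB")
print(f"Case 1, data per node that may be queried: {case1_gb} GB")
print(f"Case 2, data per node that may be queried: {case2_gb} GB")
```

Under these assumptions neither case fits in the page cache, but Case 2 spreads the same traffic over three times as much data per node, so a smaller share of reads is likely to be served from memory.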

Given that your shards are small, it may make sense to try reducing the number of indices and then experimenting with the number of replicas. If you have fewer replicas, the total data volume held shrinks, which means more of the data can be cached. If you are not indexing new data, you can also try force merging the indices down to a single segment to improve search performance further.
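
As a minimal sketch of the two changes suggested above, the snippet below builds (but does not send) the requests for the update index settings API and the force merge API. The cluster URL and the index name `my-index` are assumptions for illustration, and the helper names are hypothetical.

```python
import json
from urllib import request

ES_URL = "http://localhost:9200"  # assumed address of the cluster

def replicas_request(index, replicas):
    """Build a PUT <index>/_settings request that changes number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return request.Request(
        f"{ES_URL}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

def forcemerge_request(index):
    """Build a POST <index>/_forcemerge request targeting a single segment."""
    return request.Request(
        f"{ES_URL}/{index}/_forcemerge?max_num_segments=1",
        method="POST",
    )

# "my-index" is a placeholder index name.
for req in (replicas_request("my-index", 2), forcemerge_request("my-index")):
    print(req.get_method(), req.full_url)
    # To actually send the request: request.urlopen(req)
```

Force merging is best reserved for indices that no longer receive writes, as noted above, and it is worth re-running the latency test after each change so the effect of replicas and segments can be seen separately.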

It may also be worthwhile to monitor your heap usage and, if you have room to spare, try reducing the heap size. This will leave more room for the operating system page cache, which can reduce disk I/O.
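
One way to check for spare heap is the nodes stats API (`GET _nodes/stats/jvm`), which reports `heap_used_percent` and `heap_max_in_bytes` per node. The sketch below only summarizes a response that has already been fetched and parsed. The sample values and node names are made up for illustration, and `heap_summary` is a hypothetical helper.

```python
def heap_summary(stats):
    """Return (node name, heap used %, max heap in GiB) for each node, sorted by name."""
    rows = []
    for node in stats["nodes"].values():
        mem = node["jvm"]["mem"]
        rows.append(
            (node["name"], mem["heap_used_percent"], mem["heap_max_in_bytes"] / 2**30)
        )
    return sorted(rows)

# Made-up sample in the shape of a _nodes/stats/jvm response (not real measurements).
sample = {
    "nodes": {
        "id-b": {"name": "node-2", "jvm": {"mem": {"heap_used_percent": 41,
                                                   "heap_max_in_bytes": 16 * 2**30}}},
        "id-a": {"name": "node-1", "jvm": {"mem": {"heap_used_percent": 35,
                                                   "heap_max_in_bytes": 16 * 2**30}}},
    }
}

for name, used_pct, max_gib in heap_summary(sample):
    print(f"{name}: {used_pct}% of {max_gib:.0f} GiB heap in use")
```

If usage stays consistently low under the search load, that suggests the heap can be reduced, leaving the freed memory to the page cache.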

Thank you for your answer!!

As I understand it, every search request creates a search context and uses an IndexReader to open/read segment files for the query/fetch steps.
Is the data of the segment files also stored in the page cache?

And your suggestion (reducing the heap size) seems very good.
We will try reducing the current heap size or upgrading the system (e.g., 32GB -> 64GB RAM).

We have tested with a reduced heap size, and search latency has also decreased.
