I am running a complex search that takes a while to complete, and I am trying to figure out why. I ran the search through the profiler, and it looks like the total search time is much greater than the individual index times.
- Elasticsearch 7.10.1
- Cluster has 40 nodes, each with an 8-core CPU and 64 GB of memory, with 31 GB devoted to heap.
- Roughly 40 shards; most nodes hold only a single shard.
- The shards are big, ranging from 20 GB to 60 GB.
- Searches tend to hit all of the shards across all nodes.
- Notice in the image below that the shard searches are relatively fast (<1 s), but the overall search is slow (>10 s).
A few questions:
- What exactly is the cumulative time? Is it all of the individual times summed up, or does it represent the total time of the search?
- Why would the total time be so high when the individual times are so low?
- How does one debug slow searches (>10 s) when the individual shard searches are fast (<1 s)?
- Possible solution: Am I limiting myself in any way by having a single shard on a node? Would splitting up the shards so there are more shards per node increase parallelism in any way?
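
To put a number on the gap, one thing I can do is compare the `took` value of the response with the per-shard times from the `profile` section. The following is a minimal sketch, assuming the search was run with `"profile": true` and the JSON response was parsed into a Python dict. The function names and the sample response (node name, index name, timings) are made up for illustration. As far as I understand from the Profile API documentation, the profiler does not measure the fetch phase, network overhead, time spent in the search queue, or the merge on the coordinating node, so the difference computed here is only a rough estimate of where the unaccounted time goes.

```python
from typing import Any, Dict


def shard_query_times_ms(response: Dict[str, Any]) -> Dict[str, float]:
    """Return the profiled query time per shard, in milliseconds.

    Only top-level query nodes are summed, because each node's
    time_in_nanos already includes the time of its children.
    """
    totals: Dict[str, float] = {}
    for shard in response.get("profile", {}).get("shards", []):
        nanos = 0
        for search in shard.get("searches", []):
            nanos += search.get("rewrite_time", 0)
            nanos += sum(node["time_in_nanos"] for node in search.get("query", []))
        totals[shard["id"]] = nanos / 1_000_000
    return totals


def unprofiled_gap_ms(response: Dict[str, Any]) -> float:
    """Rough estimate: overall `took` minus the slowest shard's query time."""
    times = shard_query_times_ms(response)
    slowest = max(times.values(), default=0.0)
    return response["took"] - slowest


# Hypothetical, hand-written response with two shards (not real cluster data).
sample = {
    "took": 10500,  # overall time in milliseconds
    "profile": {
        "shards": [
            {
                "id": "[node-a][my-index][0]",
                "searches": [
                    {
                        "query": [{"type": "BooleanQuery", "time_in_nanos": 800_000_000}],
                        "rewrite_time": 2_000_000,
                    }
                ],
            },
            {
                "id": "[node-b][my-index][1]",
                "searches": [
                    {
                        "query": [{"type": "BooleanQuery", "time_in_nanos": 450_000_000}],
                        "rewrite_time": 1_000_000,
                    }
                ],
            },
        ]
    },
}

for shard_id, ms in shard_query_times_ms(sample).items():
    print(f"{shard_id}: {ms:.1f} ms")
print(f"took minus slowest shard: {unprofiled_gap_ms(sample):.1f} ms")
```

If the gap is large while every shard reports a small query time, that would suggest the time is being spent outside what the profiler covers, which is where I would look next.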