I have the following issue: our largest data stream ("C") is responding poorly to queries.
The main parts of the stack:
10*hot nodes (each: 16 cores, 60GB+ memory)
6*warm nodes (each: 16 cores, 60GB+ memory)
The data stream's ("C") specs:
retention on hot: 170 hours
retention on warm: 31 days
backing indices' size: ~1.4TB
30 primary shards per index
on average 3,200,000,000 documents per index
The allocation policy is 5 shards per node.
During the warm phase, the segments are force-merged to 1.
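For reference, the relevant part of the ILM policy looks roughly like this (a sketch built as a Python dict; the rollover condition and `min_age` are illustrative values, not our exact settings, but the warm-phase force-merge to a single segment is the part that matters here):

```python
import json

# Sketch of the ILM policy described above; the rollover condition is an
# illustrative placeholder, the warm-phase forcemerge is the real setting.
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {"max_primary_shard_size": "50gb"}  # illustrative
                }
            },
            "warm": {
                "min_age": "170h",  # matches the hot retention above
                "actions": {
                    "forcemerge": {"max_num_segments": 1}  # merge to 1 segment
                },
            },
        }
    }
}

print(json.dumps(ilm_policy, indent=2))
```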
If I make a request for the past 5 days, I get a response in ~10 seconds. However, if I send the very same request for an interval served only by the warm nodes, it takes 100-200 seconds (it's a prod environment). A query over 30 days took 25 minutes to finish.
While I do understand that the hot tier has more processing power (four more VMs), those nodes are constantly indexing data (on average 100K docs/s across the cluster). So, I don't think the comparatively "idle" warm nodes should respond this slowly.
I checked, and CPU usage stays well below 50%. I don't see issues with the JVM heap either (I monitored it during one of the requests), and disk I/O doesn't reach half of what the SSDs can sustain. Naturally, all the nodes are set up identically, they're in the same DC, and there isn't any rogue process blocking ES on the warm nodes.
I'd really like to avoid throwing resources at the issue; that wouldn't solve anything, only "hide" it, and money is naturally the more finite resource.
Does anyone have any pointers on what can be done to speed up search on the warm nodes? I respectfully don't want to focus on the query itself (it can still be optimized a bit), since, as I said, it runs 10-20 times faster on nodes that are far busier than the warm tier. I'm happy to provide further information if needed and possible.
There are 30-40 fields per document (~36 on average).
Yep, all fields are properly mapped with explicit types.
One shard is around 48GB.
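Those figures are consistent with each other; a quick sanity check on the numbers above:

```python
# Sanity check on the sizes given above (1.4 TB and 3.2B docs per index,
# 30 primary shards): per-shard size and document count.
index_size_gb = 1.4 * 1024
primary_shards = 30
docs_per_index = 3_200_000_000

shard_size_gb = index_size_gb / primary_shards    # ~47.8 GB -> the ~48GB above
docs_per_shard = docs_per_index / primary_shards  # ~107M docs per shard

print(round(shard_size_gb, 1), round(docs_per_shard / 1e6))
```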
It's a big prod cluster, so there's constantly something running. When I checked, there were ~40 tasks, with a maximum of 4 on any one node.
Regarding the query: it's a cardinality aggregation, which is resource-intensive, but it's a main KPI that we must have. Over the past years (before 8.x and data streams), I kept optimizing the query, and I'll take the last step soon (moving all remaining filters into pre-processing). That's why I wrote that I don't want to focus on it.
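For context, the shape of the query is roughly this (built as a Python dict; `@timestamp` and `user.id` are placeholders standing in for the real mapping, and the time range is just an example):

```python
import json

# Rough shape of the KPI query: a time-range filter plus a cardinality
# aggregation. Field names here are placeholders, not our real mapping.
query_body = {
    "size": 0,
    "query": {
        "bool": {
            "filter": [
                {"range": {"@timestamp": {"gte": "now-30d", "lte": "now"}}}
            ]
        }
    },
    "aggs": {
        "unique_values": {
            "cardinality": {
                "field": "user.id",
                "precision_threshold": 3000,  # ES default; higher costs more memory
            }
        }
    },
}

print(json.dumps(query_body))
```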
I did some calculations, and it seems that while the hot nodes hold only ~1.1 shards per node, the warm ones hold ~8. Still, I feel the warm nodes' resources aren't being utilized, since I don't see peaks in any of the charts while these queries run. This may have something to do with thread pools or something that can't be helped, but I wanted to ask around here too, in case I'm missing something.
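To illustrate why the charts could stay flat: assuming a single query uses at most one search thread per shard on a node (and with everything force-merged to one segment, there's no inter-segment parallelism to exploit either), the default search thread pool on a 16-core node would only be fractionally busy:

```python
# Default ES search thread pool size: int((allocated_processors * 3) / 2) + 1
cores = 16
search_pool_size = (cores * 3) // 2 + 1  # 25 threads

# With the ~8 shards/node figure above, and at most one search thread per
# shard per query, a single KPI query keeps only ~8 of 25 threads busy:
shards_per_warm_node = 8
busy_fraction = shards_per_warm_node / search_pool_size

print(search_pool_size, round(busy_fraction, 2))
```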