Hello,
After upgrading from ES 7.16.2 to ES 8.11.4, search queries that ran fine on ES7 now fail on ES8. This is unexpected to me, especially since the new cluster has more resources.
The setup of data nodes:
- ES7 -> 4 data nodes, each ~4GB memory for ES and ~4GB for OS (Ubuntu 20)
- ES8 -> 4 data nodes, each ~8GB memory for ES and ~8GB for OS (Ubuntu 22)
The queries are heavy, but they work on ES7, which makes me think the problem is specific to ES8. I checked the circuit breaker settings on both clusters and found no difference.
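For anyone who wants to repeat the comparison, here is a minimal sketch in Python of one way to do it. It assumes the response of `GET _cluster/settings?include_defaults=true&flat_settings=true` has been saved from each cluster. The helper names `breaker_settings` and `diff_settings` and the sample values are hypothetical placeholders, not the real output of my clusters.

```python
def breaker_settings(response):
    """Flatten a cluster settings response and keep only breaker-related keys.

    Later sections override earlier ones: defaults < persistent < transient.
    """
    merged = {}
    for section in ("defaults", "persistent", "transient"):
        merged.update(response.get(section, {}))
    return {key: value for key, value in merged.items() if "breaker" in key}


def diff_settings(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Hypothetical sample responses (illustrative values only).
es7_response = {
    "persistent": {},
    "transient": {},
    "defaults": {
        "indices.breaker.total.use_real_memory": "true",
        "indices.breaker.total.limit": "95%",
        "indices.breaker.request.limit": "60%",
        "indices.breaker.fielddata.limit": "40%",
        "cluster.name": "placeholder",
    },
}
es8_response = {
    "persistent": {},
    "transient": {},
    "defaults": dict(es7_response["defaults"]),
}

differences = diff_settings(breaker_settings(es7_response),
                            breaker_settings(es8_response))
print(differences or "no difference in breaker settings")
```

An empty result only says the breaker limits are configured the same way. Since the limits are percentages of the heap, the absolute byte limits still differ between the two clusters.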
The actual error is:
"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"product_34_126_t","node":"tf_L3iH8T8igkOzQMSBKDw","reason":{"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<reused_arrays>] would be [7964077776/7.4gb], which is larger than the limit of [7961208422/7.4gb], real usage: [7958064848/7.4gb], new bytes reserved: [6012928/5.7mb], usages [eql_sequence=0/0b, fielddata=18659348/17.7mb, request=9007952/8.5mb, inflight_requests=5090/4.9kb, model_inference=0/0b]","bytes_wanted":7964077776,"bytes_limit":7961208422,"durability":"PERMANENT"}}]
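To make the numbers in this error easier to read, here is a rough breakdown in Python. All figures are copied from the message above. The one assumption is that `indices.breaker.total.limit` is at its default of 95% of the heap (the default when `indices.breaker.total.use_real_memory` is `true`); the implied heap size follows from that assumption only.

```python
# Figures copied from the circuit_breaking_exception above.
real_usage = 7958064848
new_bytes_reserved = 6012928
bytes_wanted = 7964077776
bytes_limit = 7961208422
child_usages = {
    "eql_sequence": 0,
    "fielddata": 18659348,
    "request": 9007952,
    "inflight_requests": 5090,
    "model_inference": 0,
}

# The parent breaker adds the new reservation to the real heap usage.
assert real_usage + new_bytes_reserved == bytes_wanted

overshoot = bytes_wanted - bytes_limit          # bytes over the parent limit
tracked = sum(child_usages.values())            # memory the child breakers account for
untracked = real_usage - tracked                # heap in use that no child breaker tracks

# ASSUMPTION: the parent limit is the default 95% of the JVM heap.
assumed_limit_fraction = 0.95
implied_heap_mib = bytes_limit / assumed_limit_fraction / 2**20

print(f"overshoot:        {overshoot} bytes")
print(f"tracked by child breakers: {tracked} bytes "
      f"({tracked / real_usage:.2%} of real usage)")
print(f"untracked heap:   {untracked} bytes")
print(f"implied heap:     {implied_heap_mib:.0f} MiB (under the 95% assumption)")
```

So the request that tripped the breaker only asked for about 5.7mb, and the child breakers together account for well under 1% of the real usage. The heap was already sitting just below the parent limit before this request arrived.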
Here are some logs from that node (there are many of them, all of the same type):
[2024-05-08T15:58:49,871][INFO ][o.e.m.j.JvmGcMonitorService] [eu-test-dataHorse-1] [gc][2619] overhead, spent [268ms] collecting in the last [1s]
[2024-05-08T15:58:52,686][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [eu-test-dataHorse-1] attempting to trigger G1GC due to high heap usage [8297270888]
[2024-05-08T15:58:52,702][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [eu-test-dataHorse-1] memory usage down after [0], before [8297270888], after [8261635704]
[2024-05-08T15:58:52,702][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [eu-test-dataHorse-1] GC did bring memory usage down, before [8297270888], after [8261635704], allocations [17], duration [17]
[2024-05-08T15:58:52,710][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [eu-test-dataHorse-1] memory usage not down after [8], before [8299384440], after [8299384440]
[2024-05-08T15:58:52,710][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [eu-test-dataHorse-1] memory usage not down after [8], before [8299384440], after [8299384440]
[2024-05-08T15:58:52,716][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [eu-test-dataHorse-1] memory usage not down after [14], before [8299384440], after [8299384440]
[2024-05-08T15:58:52,717][INFO ][o.e.i.b.HierarchyCircuitBreakerService] [eu-test-dataHorse-1] memory usage not down after [15], before [8299384440], after [8299384440]
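As a quick way to read these lines, the sketch below (Python, standard library only) pulls the `before`/`after` heap figures out of three of the log lines above and compares them with the parent breaker limit from the error. The regular expression is my own and only fits this particular message format.

```python
import re

# Three of the HierarchyCircuitBreakerService messages above (prefix trimmed).
log_lines = [
    "memory usage down after [0], before [8297270888], after [8261635704]",
    "memory usage not down after [8], before [8299384440], after [8299384440]",
    "memory usage not down after [14], before [8299384440], after [8299384440]",
]

# Parent breaker limit, copied from "bytes_limit" in the error above.
bytes_limit = 7961208422

pattern = re.compile(r"before \[(\d+)\], after \[(\d+)\]")

freed = []
still_over_limit = []
for line in log_lines:
    match = pattern.search(line)
    before, after = int(match.group(1)), int(match.group(2))
    freed.append(before - after)                 # bytes reclaimed by this GC attempt
    still_over_limit.append(after > bytes_limit)

for line_freed, over in zip(freed, still_over_limit):
    print(f"freed {line_freed / 2**20:.1f} MiB, still over limit: {over}")
```

In these samples the one GC that did reclaim memory freed roughly 34 MiB, and the heap stayed above the parent limit afterwards in every case.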
I would appreciate any ideas.