Elasticsearch cluster details:
3x master-only nodes
3x data nodes - 8GB RAM, 8 CPUs
1x coordinator node - 8GB RAM, 6GB heap
Heap size - 20GB
Documents - 401,859,738
Indices - 249
Primary shards - 745 (3 per index)
Daily indices for logs; the average index size is 500MB-1GB (around 20 are 10GB+).
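For context, here is a rough sketch of what those index sizes mean per shard. It assumes each index's data is split evenly across its 3 primaries, which real shards rarely are, and the helper name `shard_size_mb` is just mine for illustration:

```python
# Rough per-shard size estimate from the figures above.
# Assumption: each index's data is spread evenly over its primary shards.
PRIMARIES_PER_INDEX = 3


def shard_size_mb(index_size_mb: float, primaries: int = PRIMARIES_PER_INDEX) -> float:
    """Return the approximate size of one primary shard in MB."""
    return index_size_mb / primaries


for label, size_mb in [("500MB index", 500), ("1GB index", 1024), ("10GB index", 10240)]:
    print(f"{label}: ~{shard_size_mb(size_mb):.0f} MB per primary shard")
```

So most of my shards are only a few hundred MB each, while the 20 or so large indices have shards in the multi-GB range.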
I am having an issue when searching far back (around 6 months or more): I get timeouts and shard-failed errors. There is nothing really useful in the Elasticsearch logs except this (occasionally, not every time):
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<agg >] would be [4031321656/3.7gb], which is larger than the limit of [4013975142/3.7gb], real usage: [4031316536/3.7gb], new bytes reserved: [5120/5kb]
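To make the numbers in that message easier to read, I broke them down with a small script. This is only a sketch: `parse_breaker` is my own helper name, not anything from Elasticsearch, and the regexes assume the message format shown above.

```python
import re

# The message body copied from the log line above.
LOG_LINE = (
    "[parent] Data too large, data for [<agg >] would be [4031321656/3.7gb], "
    "which is larger than the limit of [4013975142/3.7gb], "
    "real usage: [4031316536/3.7gb], new bytes reserved: [5120/5kb]"
)


def parse_breaker(line: str) -> dict:
    """Extract the raw byte counts from a CircuitBreakingException message."""
    patterns = {
        "would_be": r"would be \[(\d+)/",
        "limit": r"limit of \[(\d+)/",
        "real_usage": r"real usage: \[(\d+)/",
        "new_bytes": r"new bytes reserved: \[(\d+)/",
    }
    return {name: int(re.search(pat, line).group(1)) for name, pat in patterns.items()}


stats = parse_breaker(LOG_LINE)
over_by = stats["would_be"] - stats["limit"]
print(f"over the limit by {over_by} bytes ({over_by / 2**20:.1f} MiB)")
print(f"real usage vs limit: {stats['real_usage'] / stats['limit']:.1%}")
print(f"requested by this aggregation: {stats['new_bytes']} bytes")
```

The real usage figure is already above the limit on its own, and the aggregation that tripped the breaker only asked for another 5kb, so the node looks to be short on heap in general rather than this one request being huge.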
Sometimes when I get the timeouts, if I then click refresh the data loads straight away, so it looks as if the first request nearly finishes loading/caching the data and the second one just displays it.
I am just looking for some advice on improving performance.
What I am thinking at the moment is to force merge all indices. Will this help?
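What I have in mind is roughly the following, using the `_forcemerge` API with `max_num_segments=1`. This is a sketch only: the cluster address and the index pattern are placeholders, and I would only run it against old daily indices that are no longer being written to. The script builds the request without sending it.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

ES_URL = "http://localhost:9200"  # assumption: replace with the real cluster address


def forcemerge_request(index: str, max_num_segments: int = 1) -> Request:
    """Build (but do not send) a POST /<index>/_forcemerge request."""
    # Keep wildcard and comma characters so index patterns pass through unchanged.
    path = quote(index, safe="*,-_.")
    url = f"{ES_URL}/{path}/_forcemerge?max_num_segments={max_num_segments}"
    return Request(url, method="POST")


req = forcemerge_request("logs-2020.01.*")  # hypothetical index pattern
print(req.get_method(), req.full_url)

# To actually run it against the cluster, uncomment:
# with urlopen(req) as resp:
#     print(resp.read().decode())
```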
Will increasing my node count help? Since all my indices have 3 shards, I don't see how scaling up my cluster could help unless I reindex and change the number of shards to match the node count.
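The arithmetic behind that question, as a rough sketch. It assumes primaries are balanced evenly across data nodes and ignores replicas (both simplifications), and `per_node` is just an illustrative helper:

```python
PRIMARIES_TOTAL = 745      # from the cluster details above
PRIMARIES_PER_INDEX = 3


def per_node(data_nodes: int) -> tuple:
    """Return (data nodes one index's primaries can occupy at most,
    average primaries per data node across the whole cluster)."""
    nodes_used_by_one_index = min(PRIMARIES_PER_INDEX, data_nodes)
    avg_primaries_per_node = PRIMARIES_TOTAL / data_nodes
    return nodes_used_by_one_index, avg_primaries_per_node


for n in (3, 4, 6):
    used, avg = per_node(n)
    print(f"{n} data nodes: one index spans at most {used} nodes, "
          f"~{avg:.0f} primaries per node overall")
```

So a single 3-shard index never spans more than 3 data nodes, which is my concern, but a 6-month search touches many daily indices at once, and the number of shards each node has to serve does drop as nodes are added. I am not sure which of the two effects matters more here.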