I have a weird issue after migrating Elasticsearch from 2.4 to 6.3.
My cluster consists of 10 nodes:
3 master nodes (m3.medium, 1 CPU/4 GB RAM)
3 client nodes running ES, Logstash and Kibana (c5.xlarge, 4 CPU/8 GB RAM)
4 data nodes (r4.2xlarge, 8 CPU/61 GB RAM)
Data inflow looks like this:
A load balancer sends data to the Logstash TCP listener.
Kibana and Logstash send data and queries to the client node on localhost, which is then supposed to load balance them across the data nodes.
The problem is that queries take much longer than they did in 2.4, and Datadog reports that one of the data nodes receives more queries than the others.
I don't understand why. Can anybody help here?
I'm continuing to check for any differences in configuration.
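To cross-check what Datadog reports, the per-node search counters can be read from Elasticsearch itself. Below is a minimal sketch, assuming the JSON from `GET _nodes/stats/indices/search` has already been fetched and parsed into a dict. The function name and the sample numbers are made up for illustration. The counters are cumulative since each node started, so it is more meaningful to take two snapshots and compare the difference.

```python
from typing import Any, Dict


def search_load_by_node(stats: Dict[str, Any]) -> Dict[str, Dict[str, float]]:
    """Summarise search load per node.

    `stats` is the parsed response of GET _nodes/stats/indices/search.
    Returns {node_name: {"query_total": ..., "avg_query_ms": ...}}.
    """
    result: Dict[str, Dict[str, float]] = {}
    for node in stats.get("nodes", {}).values():
        search = node.get("indices", {}).get("search", {})
        total = search.get("query_total", 0)
        millis = search.get("query_time_in_millis", 0)
        result[node.get("name", "?")] = {
            "query_total": total,
            # Avoid dividing by zero on nodes that served no queries.
            "avg_query_ms": millis / total if total else 0.0,
        }
    return result


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the real cluster.
    sample = {
        "nodes": {
            "id0": {"name": "node-0", "indices": {"search": {
                "query_total": 200, "query_time_in_millis": 1000}}},
            "id1": {"name": "node-1", "indices": {"search": {
                "query_total": 50, "query_time_in_millis": 100}}},
        }
    }
    for name, load in sorted(search_load_by_node(sample).items()):
        print(name, load)
```

If one data node shows a much larger `query_total` delta than the others over the same interval, that confirms the imbalance independently of the monitoring agent.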
I'm using routing awareness based on availability zone, and I just realized that node-0 and node-3 are in the same AZ, while the others are each in their own. I will deploy a new instance in a 4th zone and see if that helps.
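To confirm how the nodes are spread across zones before and after the change, the node attributes can be grouped by zone. This is a rough sketch, assuming the rows come from `GET _cat/nodeattrs?format=json` (each row has `node`, `attr` and `value` fields). The attribute name `zone` and the zone values below are placeholders; the real name is whatever the awareness setting in the cluster refers to.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def nodes_per_zone(nodeattrs: Iterable[Mapping[str, str]],
                   attr_name: str) -> Dict[str, List[str]]:
    """Group node names by the value of one node attribute.

    `nodeattrs` is the parsed output of GET _cat/nodeattrs?format=json.
    `attr_name` is the awareness attribute (placeholder: "zone").
    """
    zones: Dict[str, List[str]] = defaultdict(list)
    for row in nodeattrs:
        if row.get("attr") == attr_name:
            zones[row["value"]].append(row["node"])
    return {zone: sorted(nodes) for zone, nodes in zones.items()}


if __name__ == "__main__":
    # Placeholder zone names; node-0 and node-3 share a zone as described above.
    rows = [
        {"node": "node-0", "attr": "zone", "value": "zone-a"},
        {"node": "node-1", "attr": "zone", "value": "zone-b"},
        {"node": "node-2", "attr": "zone", "value": "zone-c"},
        {"node": "node-3", "attr": "zone", "value": "zone-a"},
    ]
    for zone, nodes in sorted(nodes_per_zone(rows, "zone").items()):
        print(zone, nodes)
```

Any zone that lists more than one data node while the others list exactly one is the uneven layout described above.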