We have an Elasticsearch cluster running version 2.4.6 at our company, and we recently ran into a strange problem. After adding nodes to the cluster, a large number of client requests time out and come back with status 499 or 502, even though the cluster has already finished rebalancing. Once we remove the new nodes, the client recovers in less than two minutes.
The cluster version is 2.4.6, and the client is:
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>2.4.6</version>
The old machines run CentOS 6.10 and the new machines run CentOS 7.1. There are no other differences between the old nodes and the new nodes.
There are also no error logs in the cluster.
There are no hot nodes while the client requests are timing out. I have checked the shard distribution, and both the shard balance and the index balance look fine.
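For reference, a shard distribution check of this kind can be sketched as follows. This is only a minimal sketch, assuming the output of `GET _cat/shards?h=index,shard,prirep,state,node` has been saved as plain text. The function names and the index and node names in the sample are hypothetical, not the real ones from our cluster.

```python
from collections import Counter


def shards_per_node(cat_shards_text):
    """Count shards per node from `_cat/shards` text output.

    Assumes the columns were requested as h=index,shard,prirep,state,node.
    """
    counts = Counter()
    for line in cat_shards_text.splitlines():
        # Split into at most 5 fields so node names containing spaces stay intact.
        fields = line.split(None, 4)
        if len(fields) < 5:
            # UNASSIGNED shards have an empty node column; blank lines land here too.
            continue
        # A relocating shard is reported as "source -> target ..."; count it on the source.
        node = fields[4].split(" -> ")[0].strip()
        counts[node] += 1
    return counts


def imbalance(counts):
    """Return (max - min) shard count across nodes, or 0 if there are no nodes."""
    if not counts:
        return 0
    return max(counts.values()) - min(counts.values())


if __name__ == "__main__":
    # Hypothetical sample output, not taken from the real cluster.
    sample = """\
logs-2019 0 p STARTED old-node-1
logs-2019 0 r STARTED new-node-1
logs-2019 1 p STARTED new-node-1
logs-2019 1 r UNASSIGNED
"""
    counts = shards_per_node(sample)
    for node, n in sorted(counts.items()):
        print(node, n)
    print("imbalance:", imbalance(counts))
```

A small difference between the highest and lowest per-node count is what I mean by the shard balance being ok. The check says nothing about request latency on each node.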
Has anyone run into a case like this? Can somebody help, or give me some ideas on how to troubleshoot this problem?