I have 4 EC2 machines in an Elasticsearch cluster.
Configuration: c5d.large, Memory: 3.5G, Data Disk: 50GB NVMe instance storage.
Elasticsearch Version: 6.8.21
I added a 5th machine with the same configuration. After that, search requests started taking more time than before. I enabled slow logs, which show that only the shards hosted on the 5th node are slow to search. I can also see high disk read IO on the new node when I trigger search requests. The iowait% increases with the number of search requests and goes up to 90-95%. None of the old nodes show any read spikes.
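For reference, this is roughly how I enabled the search slow log and watched the disk IO; the index name my-index and the threshold values are placeholders, not my actual settings:

    # Enable the search slow log on an index (6.x settings API)
    curl -X PUT "localhost:9200/my-index/_settings" -H 'Content-Type: application/json' -d'
    {
      "index.search.slowlog.threshold.query.warn": "1s",
      "index.search.slowlog.threshold.query.info": "500ms"
    }'

    # Watch per-device reads and iowait on the new node while triggering searches
    iostat -x 1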
I checked elasticsearch.yml, jvm.options and even the sysctl -A output. There is no difference between the config on the new node and the old nodes.
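This is roughly how I diffed them; the hostnames are placeholders, and I am assuming the default package install paths under /etc/elasticsearch:

    # Compare Elasticsearch and kernel configuration between an old node and the new one
    diff <(ssh old-node-1 cat /etc/elasticsearch/elasticsearch.yml) \
         <(ssh new-node-5 cat /etc/elasticsearch/elasticsearch.yml)
    diff <(ssh old-node-1 cat /etc/elasticsearch/jvm.options) \
         <(ssh new-node-5 cat /etc/elasticsearch/jvm.options)
    diff <(ssh old-node-1 sysctl -A) <(ssh new-node-5 sysctl -A)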
Shard reallocation had already completed, and I waited another 20 minutes for the CPU to stabilize before triggering search requests. The disk reads spike only when I trigger search requests, and only on the new machine.
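I verified that rebalancing was done with something like:

    # No shards should be relocating or initializing, and every shard should be STARTED
    curl -s "localhost:9200/_cluster/health?pretty"
    curl -s "localhost:9200/_cat/shards?v" | grep -v STARTED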
I tried with a new VM and got the same issue, so there is nothing wrong with the provisioned VM itself.
Well, I did find a difference in lscpu between the old and new VMs. The new VM has a better CPU.
Please find the lscpu output below. The new VM has one extra CPU flag, invpcid_single, and it is missing the hle and rtm flags compared to the old VMs.
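This is how I compared the flags (hostnames are placeholders):

    # Diff the CPU flag sets of an old VM and the new one
    diff <(ssh old-node-1 "lscpu | grep Flags | tr ' ' '\n' | sort") \
         <(ssh new-node-5 "lscpu | grep Flags | tr ' ' '\n' | sort")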