We saw the following two warnings today in the logs of our master node:
[2020-06-06T07:52:07,974][WARN ][o.e.t.TransportService ] [es7advcl02-01] Received response for a request that has timed out, sent [26018ms] ago, timed out [16011ms] ago, action [internal:coordination/fault_detection/follower_check], node [{es7advcl02-02}{Z9H7777WSCecxRWxw-Vgtg}{PfLZ3HshSmm2w3Abyw7djQ}{xxxx}{xxxx:xxxx}{ml.machine_memory=8352976896, ml.max_open_jobs=20, xpack.installed=true}], id [31175861]
[2020-06-06T07:52:07,974][WARN ][o.e.t.TransportService ] [es7advcl02-01] Received response for a request that has timed out, sent [15011ms] ago, timed out [5003ms] ago, action [internal:coordination/fault_detection/follower_check], node [{es7advcl02-02}{Z9H7777WSCecxRWxw-Vgtg}{PfLZ3HshSmm2w3Abyw7djQ}{xxxx}{xxxx:xxxx}{ml.machine_memory=8352976896, ml.max_open_jobs=20, xpack.installed=true}], id [31176001]
An equivalent leader_check warning was logged on es7advcl02-02 as well.
During this time we saw a blip in our iowait stats, and there were a couple of 503/504 errors as well. What does this message mean? It is only logged at WARN level, but judging by the 503/504 errors it looks like it affected the stability of our cluster.
Our cluster state is around 426 KB.
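In case it helps with reading the two log lines above: subtracting the "timed out ... ago" value from the "sent ... ago" value gives how long each request waited before it was declared timed out. Here is a minimal Python sketch of that arithmetic. The helper name and the regex are my own, and the sample strings are the two log lines abbreviated to the relevant fields.

```python
import re

# Matches the "sent [...] ago, timed out [...] ago, action [...]" part of the warning.
# The pattern is an assumption based only on the two log lines in this post.
PATTERN = re.compile(
    r"sent \[(\d+)ms\] ago, timed out \[(\d+)ms\] ago, action \[([^\]]+)\]"
)

def timeout_window_ms(line):
    """Return (action, ms waited before the request was declared timed out), or None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    sent_ms = int(m.group(1))
    timed_out_ms = int(m.group(2))
    return m.group(3), sent_ms - timed_out_ms

# The two warnings from the master node, abbreviated to the relevant fields.
samples = [
    "sent [26018ms] ago, timed out [16011ms] ago, "
    "action [internal:coordination/fault_detection/follower_check]",
    "sent [15011ms] ago, timed out [5003ms] ago, "
    "action [internal:coordination/fault_detection/follower_check]",
]

for sample in samples:
    print(timeout_window_ms(sample))
```

Both lines come out at roughly 10 seconds (10007 ms and 10008 ms), which I believe matches the default follower check timeout, so es7advcl02-02 took about 15 and 26 seconds to answer these two checks.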