ELK loses contact with the master every morning at 8 am and the cluster turns red

Hello,

Our cluster turns red shortly after 8 am every day. It has 6 hot data nodes and 3 warm data nodes, and the master (primary) node is also one of the hot data nodes. Recently we noticed something strange: the cluster status turns red at 8:12 every day. The master log shows:

[esdata01] current.health="RED" message="Cluster health status changed from [GREEN] to [RED] (reason: [{esdata09}{_gbguhWlS0OxedNrcCIUgQ}{Dxtoj4WYSjGBROPS9mpHcQ}{esdata09}{sw}{8.8.2} reason: {{esdata09}{_GBguhwLS0OxedNrcciugQ}{DXtoj4WysjGbrops9mphCq}{esdata09}{SW}{8.8.2} lagging, {esdata07}{HNptjVxORhe0mU--i0km6g}{l8kuExZ-Sz-okJdIltZP7Q}{esdata07}{sw}{8.8.2} reason: lagging])." previous.health="GREEN" reason="{esdata09}{_gbguhWlS0OxedNrcCIUgQ}{Dxtoj4WYSjGBROPS9mpHcQ}{esdata09}{sw}{8.8.2} reason: {esdata09}{_GBGuhwLS0OxedNrcciugQ}{DXtoj4WysjGbrops9mphCq}{esdata09}{SW}{8.8.2} lagging, {esdata07}{HNptjVxORhe0mU--i0km6g}{l8kuExZ-Sz-okJdIltZP7Q}{esdata07}{sw}{8.8.2} reason: lagging"

The log is about nodes 7 and 9 (esdata07 and esdata09), with no state delay, yet we observe that the master's status is normal and there is no loss of contact.
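For anyone who wants to track which nodes are named in these health-change messages from day to day, here is a minimal Python sketch. It assumes the node descriptor shape seen in the post above ({name}{id}{id}{host}{roles}{version}); real log lines may carry extra fields such as addresses, so the pattern would need adjusting. The function name `nodes_in_message` is made up for this example.

```python
import re

# Assumed descriptor shape, taken from the post:
# {name}{id}{id}{host}{roles}{version}
NODE_DESCRIPTOR = re.compile(
    r"\{(?P<name>[^{}]+)\}"            # node name
    r"\{[^{}]+\}\{[^{}]+\}"            # two opaque IDs
    r"\{(?P<host>[^{}]+)\}"            # host name
    r"\{(?P<roles>[^{}]+)\}"           # role letters
    r"\{(?P<version>\d+(?:\.\d+)+)\}"  # version, e.g. 8.8.2
)


def nodes_in_message(message):
    """Return the distinct node names mentioned, in order of first appearance."""
    seen = []
    for match in NODE_DESCRIPTOR.finditer(message):
        name = match.group("name")
        if name not in seen:
            seen.append(name)
    return seen


# The reason="..." value copied from the log in the post.
REASON = (
    "{esdata09}{_gbguhWlS0OxedNrcCIUgQ}{Dxtoj4WYSjGBROPS9mpHcQ}{esdata09}{sw}{8.8.2} "
    "reason: {esdata09}{_GBGuhwLS0OxedNrcciugQ}{DXtoj4WysjGbrops9mphCq}{esdata09}{SW}{8.8.2} "
    "lagging, {esdata07}{HNptjVxORhe0mU--i0km6g}{l8kuExZ-Sz-okJdIltZP7Q}{esdata07}{sw}{8.8.2} "
    "reason: lagging"
)

if __name__ == "__main__":
    print(nodes_in_message(REASON))
```

Running this over each day's message would show whether it is always the same two nodes that are reported as lagging.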

You should never run a multi-node cluster with only one master-eligible node. To increase resiliency, you should always aim to have 3 master-eligible nodes in a cluster.
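To check how many master-eligible nodes a cluster currently has, one option is to look at the output of `GET _cat/nodes?h=name,node.role,master`, where an `m` in the role column marks a master-eligible node. The Python sketch below only parses that text output. The sample rows are made up for illustration and are not taken from the cluster in this thread, and the function name is hypothetical.

```python
def master_eligible_nodes(cat_nodes_text):
    """Return names of nodes whose role letters include 'm' (master-eligible).

    Expects the plain-text output of
    GET _cat/nodes?h=name,node.role,master (no header row).
    """
    names = []
    for line in cat_nodes_text.strip().splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed rows
        name, roles = parts[0], parts[1]
        if "m" in roles:
            names.append(name)
    return names


# Hypothetical sample output: name, role letters, elected-master marker.
SAMPLE = """
esdata01 hms *
esdata02 hs  -
esdata07 sw  -
"""

if __name__ == "__main__":
    eligible = master_eligible_nodes(SAMPLE)
    print(eligible)
    if len(eligible) < 3:
        print("fewer than 3 master-eligible nodes")
```

With a single master-eligible node, as in the sample rows, the list has one entry and the warning is printed.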

Please share logs from the master node from around the time of the issue.
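If the master log is large, a small script can cut out just the window around the incident before sharing. This is a rough Python sketch that assumes the plain-text server log format, where each entry starts with a timestamp like `[2023-07-10T08:12:01,123]`; JSON-formatted logs would need a different parser. The function name, the sample lines, and the 10-minute window are all assumptions for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed line prefix: [YYYY-MM-DDTHH:MM:SS,mmm]
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}),\d{3}\]")


def lines_near(lines, center, minutes=10):
    """Keep log lines whose timestamp is within +/- `minutes` of `center`.

    Lines without a timestamp (e.g. stack trace continuations) follow
    the decision made for the most recent timestamped line.
    """
    window = timedelta(minutes=minutes)
    keep = False
    selected = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
            keep = abs(ts - center) <= window
        if keep:
            selected.append(line)
    return selected


# Made-up sample lines, only to show the shape of the input.
SAMPLE_LINES = [
    "[2023-07-10T07:30:00,000][INFO ][example] early entry",
    "[2023-07-10T08:11:59,500][WARN ][example] entry near the incident",
    "    continuation line without a timestamp",
    "[2023-07-10T09:00:00,000][INFO ][example] late entry",
]

if __name__ == "__main__":
    for line in lines_near(SAMPLE_LINES, datetime(2023, 7, 10, 8, 12)):
        print(line)
```

To use it on a real file, read the lines with `open(path).read().splitlines()` and pass the date and time of the day's red status as `center`.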
