We upgraded auditbeats from version 8.6.2 to version 8.10.2 using automation tooling. The upgrade was first tested on some of our EC2 instances, and had no issues. When we applied the upgrade to a subset of physical hosts in the datacenter, they ALL crashed and rebooted. All host are running CentOS 7, kernel version 3.10.0-1160.99.1.el7.x86_64 on Dell PowerEdge R430 servers. The host would reboot, come up to a login prompt, remain up for about a minute, and then crash and reboot again. A check of the syslog showed NO messages to indicate root cause, there are no messages in syslog between the end of the last boot and the journal message from the start of the next boot. To fix the issue, we had to disable the auditbeats service with systemctl, and wait for the server to crash and reboot one last time, then revert the upgrade.
A few of the hosts were set to prompt on error, and the BIOS reported that the system had been rebooted due to a watchdog timeout.
We would like to know the root cause of this issue.