Last week, our cluster "162b24" had an issue where the hardware on its machine degraded, and we needed the cluster restored from a snapshot. The conversation is here.
Since then, the machine works properly for a while and then spikes to 100% CPU roughly every 3 days. Each time, I have to manually restart the cluster, after which it works again for a few days.
Nothing has changed on our end since then, except that we cleaned out a bunch of .watch_history-* indexes as recommended.
Any help would be appreciated.