Today, our ElasticSearch cluster stopped responding. We were not performing any sort of maintenance or making any changes.
Any request to the cluster caused it to respond with:
"Get "https://X.X.X.X:X/": http: server gave HTTP response to HTTPS client".
(IP Address filtered out for privacy, just in case)
Checking the logs, I see that right before the crash, the system logged:
[instance-0000000030] reloaded [/app/config/node.crt] and updated ssl contexts using this file
We performed a rolling restart, and the cluster started responding correctly as the nodes restarted.
This cluster is on 6.X still, and has been running fine for many years. I've never seen anything like this before. Anyone have any insight on what may have happened, why this may have happened? I've searched around, but never seen anyone talk about anything like this.
I just heard back from the ElasticCloud support team that this issue did affect multiple customers today, and wasn't unique to my instance. They've updated the status page with this issue (502 Bad Gateway).
I'll leave this issue here, in case other customers have had this issue today, to let them know that that this was identified by the ElasticCloud team as an issue affecting multiple customers.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.