We are preparing to upgrade our Elasticsearch (ES) servers from 1.4.4 to 1.7.3. From talking to various people, our understanding was that this should be relatively straightforward and low risk if done as a rolling restart of the cluster. We ran the upgrade in our (much smaller) testing environment and the cluster state went red for several minutes, which gives us pause about doing the same in production.
Here are the relevant log lines from the first node that was restarted in testing, after which the cluster went into a red state for around 15 minutes.
The part that worries me is the ElasticsearchIllegalStateException. Our production cluster has 3923 shards in total across 20 nodes, holding 85 TB of data. We had been planning to do the upgrade as a rolling restart of the cluster, but I want to make sure we aren't setting ourselves up for a long downtime while it happens. Any insight is appreciated.