Suspect illegal state: trying to move shard from primary mode to replica mode

Hi all,

I have a 6-node cluster in production running ES 1.5.0, and I'm seeing these messages roughly 7 times a day across all machines in the cluster. I did some searching, but was not able to find an explanation of why this might be happening or what I could do to fix it.

The shards that are in error are different across all machines, if that helps.

The one thread I found mentioned making sure that discovery.zen.minimum_master_nodes is set appropriately, which mine is!
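For reference, the usual rule of thumb for that setting is a strict majority of the master-eligible nodes, i.e. (N / 2) + 1. A minimal sketch of the arithmetic, assuming all 6 nodes are master-eligible (not stated above); the function name is only for illustration:

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Return the quorum (strict majority) for a count of master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    # Integer division, then add one: 6 -> 4, 3 -> 2
    return master_eligible // 2 + 1


# Assumption: all 6 nodes in the cluster are master-eligible.
print(minimum_master_nodes(6))  # prints 4
```

If only some of the nodes are master-eligible, N is the number of those nodes, not the cluster size.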

All other mentions of this error are for some really old versions of ES, so I'm at a bit of a loss.

Might anyone have a suggestion?
Many thanks,
Chris

Is anything else happening around that time, index creation or something else?

Thanks Mark.

Not index creation, but document indexing is definitely going on. No searches to speak of either. The index rate is pretty heavy: I do see the "now throttling indexing" messages, though for other indexes, not the ones that are reported to be in illegal states.

Sorry that's not much to go on.
Chris

Can you paste a few of the log lines that contain this, as well as the lines around them for context (if there are any)?

Certainly!

Here's one:

[2015-06-18 10:47:28,695][INFO ][index.engine             ] [elasticsearch-bdprodes02] [derbysoft-agoda-20150618][0] stop throttling indexing: numMergesInFlight=2, maxNumMerges=3
[2015-06-18 10:47:38,475][INFO ][index.engine             ] [elasticsearch-bdprodes02] [derbysoft-carlson-20150618][1] now throttling indexing: numMergesInFlight=4, maxNumMerges=3
[2015-06-18 10:47:38,529][INFO ][index.engine             ] [elasticsearch-bdprodes02] [derbysoft-carlson-20150618][1] stop throttling indexing: numMergesInFlight=2, maxNumMerges=3
[2015-06-18 11:28:52,276][WARN ][index.shard              ] [elasticsearch-bdprodes02] [derbysoft-apache-20150612][1] suspect illegal state: trying to move shard from primary mode to replica mode
[2015-06-18 11:44:35,129][INFO ][index.engine             ] [elasticsearch-bdprodes02] [derbysoft-agoda-20150618][0] now throttling indexing: numMergesInFlight=4, maxNumMerges=3
[2015-06-18 11:44:35,653][INFO ][index.engine             ] [elasticsearch-bdprodes02] [derbysoft-agoda-20150618][0] stop throttling indexing: numMergesInFlight=2, maxNumMerges=3

and another:

[2015-06-17 00:40:55,260][INFO ][cluster.metadata         ] [elasticsearch-bdprodes01] [derbysoft-agoda-20150617] update_mapping [http_access_dwarf4agodaplugin] (dynamic)
[2015-06-17 00:49:02,716][WARN ][index.shard              ] [elasticsearch-bdprodes01] [derbysoft-apache-20150615][1] suspect illegal state: trying to move shard from primary mode to replica mode
[2015-06-17 00:52:19,905][INFO ][index.engine             ] [elasticsearch-bdprodes01] [derbysoft-bestwestern-20150617][0] now throttling indexing: numMergesInFlight=4, maxNumMerges=3

and one more:

[2015-06-17 07:47:28,336][INFO ][index.engine             ] [elasticsearch-bdprodes01] [derbysoft-hilton-20150616][1] stop throttling indexing: numMergesInFlight=2, maxNumMerges=3
[2015-06-17 07:47:29,305][INFO ][index.engine             ] [elasticsearch-bdprodes01] [derbysoft-hilton-20150616][1] now throttling indexing: numMergesInFlight=4, maxNumMerges=3
[2015-06-17 07:47:29,395][INFO ][index.engine             ] [elasticsearch-bdprodes01] [derbysoft-hilton-20150616][1] stop throttling indexing: numMergesInFlight=2, maxNumMerges=3
[2015-06-17 07:47:44,288][INFO ][index.engine             ] [elasticsearch-bdprodes01] [derbysoft-hilton-20150615][0] now throttling indexing: numMergesInFlight=4, maxNumMerges=3
[2015-06-17 07:47:44,357][INFO ][index.engine             ] [elasticsearch-bdprodes01] [derbysoft-hilton-20150615][0] stop throttling indexing: numMergesInFlight=2, maxNumMerges=3
[2015-06-17 07:48:06,648][WARN ][index.shard              ] [elasticsearch-bdprodes01] [derbysoft-apache-20150617][1] suspect illegal state: trying to move shard from primary mode to replica mode
[2015-06-17 07:57:50,387][INFO ][cluster.metadata         ] [elasticsearch-bdprodes01] [derbysoft-hilton-20150617] update_mapping [hotel_avail_hiltonplugin4agoda] (dynamic)
[2015-06-17 07:58:52,801][INFO ][index.engine             ] [elasticsearch-bdprodes01] [derbysoft-hilton-20150617][1] now throttling indexing: numMergesInFlight=4, maxNumMerges=3
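
To see how the warnings are spread across nodes, indexes and shards, the log lines can be tallied with a short script. This is only a sketch: the regular expression is an assumption based on the line layout in the excerpts above, and `count_warnings` is a made-up helper name.

```python
import re
from collections import Counter

# Layout assumed from the excerpts: [timestamp][LEVEL][logger] [node] [index][shard] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<logger>[^\]]+?)\s*\] "
    r"\[(?P<node>[^\]]+)\] "
    r"\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\] "
    r"(?P<msg>.*)$"
)
WARNING = "suspect illegal state"


def count_warnings(lines):
    """Count the warning per (node, index, shard); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") == "WARN" and WARNING in m.group("msg"):
            key = (m.group("node"), m.group("index"), int(m.group("shard")))
            counts[key] += 1
    return counts


sample = [
    "[2015-06-18 10:47:38,529][INFO ][index.engine             ] [elasticsearch-bdprodes02] "
    "[derbysoft-carlson-20150618][1] stop throttling indexing: numMergesInFlight=2, maxNumMerges=3",
    "[2015-06-18 11:28:52,276][WARN ][index.shard              ] [elasticsearch-bdprodes02] "
    "[derbysoft-apache-20150612][1] suspect illegal state: trying to move shard from primary mode to replica mode",
]

for key, n in count_warnings(sample).items():
    print(key, n)
```

In practice the lines would come from the node's log file rather than a hard-coded list.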

Thank you again.
Chris

https://github.com/elastic/elasticsearch/issues/11395 may be related.

Can you try reindexing one of those indices to see if it helps? If it does, we can then look at getting more information about why this is happening.

Absolutely.
Once I run the re-indexer, how do I test the new index to see if anything changed?

Chris

The errors for that index should stop. It's a bit of a "hammer for a bug" approach, but it's a start.
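
One way to check that is to look for the warning against that index after the time the reindex finished. A rough sketch, assuming the log timestamp format shown earlier in the thread and that you note the finish time yourself; `warnings_after` is a hypothetical helper, not part of Elasticsearch:

```python
from datetime import datetime

WARNING = "suspect illegal state"


def warnings_after(lines, index_name, cutoff):
    """Return the warning lines for index_name that were logged after cutoff."""
    hits = []
    for line in lines:
        if not line.startswith("["):
            continue
        if WARNING not in line or f"[{index_name}]" not in line:
            continue
        # The timestamp is the first bracketed field, e.g. 2015-06-18 11:28:52,276
        stamp = line[1:line.index("]")]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
        if when > cutoff:
            hits.append(line)
    return hits


log_lines = [
    "[2015-06-18 11:28:52,276][WARN ][index.shard              ] [elasticsearch-bdprodes02] "
    "[derbysoft-apache-20150612][1] suspect illegal state: trying to move shard from primary mode to replica mode",
]

# Assumed finish time of the reindex; an empty result means no warnings since then.
remaining = warnings_after(log_lines, "derbysoft-apache-20150612", datetime(2015, 6, 18, 12, 0))
print(len(remaining))
```

If the reindexed data ends up under a different index name, pass that name instead.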

Just wanted to follow up on this. I spent several days dealing with some stability problems, and I'm getting back to this now. For whatever reason, these messages have stopped. I haven't had any in 5 days now, so I'm holding off on re-indexing things for now.

Thanks Mark for your help, and if these come back, I'll know what to try!
Chris