We are running Elasticsearch 6.8.3 with a 3.2.2 (master.client.data) setup. Our cluster is currently in the red state:
curl -X GET "$HOST/_cat/health"
1575572989 19:09:49 es-cluster-dogfood red 6 1 1050 1050 0 0 1116 0 - 48.5%
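For anyone decoding the health line above, here is a minimal Python sketch that labels each value. The column names are an assumption: they follow what I understand to be the default `_cat/health` column order (the header that `?v` would print), and they are not part of the output pasted above. `parse_cat_health` is a hypothetical helper name.

```python
# Assumed default column order of `_cat/health` when called without `?v`.
# These names are not in the pasted output; treat them as an assumption.
HEALTH_COLUMNS = [
    "epoch", "timestamp", "cluster", "status",
    "node.total", "node.data", "shards", "pri", "relo", "init",
    "unassign", "pending_tasks", "max_task_wait_time", "active_shards_percent",
]


def parse_cat_health(line):
    """Split one whitespace-separated `_cat/health` line into a dict of strings."""
    values = line.split()
    if len(values) != len(HEALTH_COLUMNS):
        raise ValueError(
            "expected %d columns, got %d" % (len(HEALTH_COLUMNS), len(values))
        )
    return dict(zip(HEALTH_COLUMNS, values))


health = parse_cat_health(
    "1575572989 19:09:49 es-cluster-dogfood red 6 1 1050 1050 0 0 1116 0 - 48.5%"
)
for name in ("status", "node.total", "node.data", "pri", "unassign"):
    print(name, "=", health[name])
```

If that column order is right, the line reports 6 nodes in the cluster but only 1 data node, with 1050 primary shards active and 1116 shards unassigned.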
We are seeing some odd errors related to one of our index aliases from an ILM policy. One of our data nodes is crash looping while the other runs fine. Because of this, the second data node is unable to join the cluster and none of the replica shards can be assigned. This is the error from the logs on that data node:
org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:163) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.8.3.jar:6.8.3]
    at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:116) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) ~[elasticsearch-6.8.3.jar:6.8.3]
Caused by: java.lang.IllegalStateException: alias [counter-alias] has more than one write index [counter-000014,counter-000013]
    at org.elasticsearch.cluster.metadata.AliasOrIndex$Alias.computeAndValidateWriteIndex(AliasOrIndex.java:173) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.cluster.metadata.MetaData$Builder.lambda$buildAliasAndIndexLookup$1(MetaData.java:1158) ~[elasticsearch-6.8.3.jar:6.8.3]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[?:?]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
    at java.util.TreeMap$ValueSpliterator.forEachRemaining(TreeMap.java:2890) ~[?:?]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[?:?]
    at org.elasticsearch.cluster.metadata.MetaData$Builder.buildAliasAndIndexLookup(MetaData.java:1158) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.cluster.metadata.MetaData$Builder.build(MetaData.java:1122) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.gateway.MetaStateService.loadFullState(MetaStateService.java:73) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.gateway.GatewayMetaState.<init>(GatewayMetaState.java:88) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.node.Node.<init>(Node.java:499) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.node.Node.<init>(Node.java:266) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:212) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:212) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-6.8.3.jar:6.8.3]
    ... 6 more
The important part of this error is:
alias [counter-alias] has more than one write index [counter-000014,counter-000013]
When I look at that alias, though, it shows that only one of those indices is write-enabled:
curl -X GET "$HOST/_alias/counter-alias?pretty"
{
  ...
  "counter-000013" : {
    "aliases" : {
      "counter-alias" : {
        "is_write_index" : false
      }
    }
  },
  ...
  "counter-000014" : {
    "aliases" : {
      "counter-alias" : {
        "is_write_index" : true
      }
    }
  },
  ...
}
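The rule stated in the error message can be sketched in a few lines of Python and applied to an `_alias` response like the one above. This is only an illustration of the rule as the message words it, not Elasticsearch's actual `computeAndValidateWriteIndex` code. The function names `write_indices` and `check_single_write_index` are hypothetical, and the sample dict contains only the two indices shown above (the elided `...` entries are left out).

```python
def write_indices(alias_response, alias):
    """Return the sorted index names that flag `alias` with is_write_index: true."""
    found = []
    for index_name, body in alias_response.items():
        alias_meta = body.get("aliases", {}).get(alias)
        if alias_meta is not None and alias_meta.get("is_write_index") is True:
            found.append(index_name)
    return sorted(found)


def check_single_write_index(alias_response, alias):
    """Return the single write index (or None); raise if more than one is flagged."""
    found = write_indices(alias_response, alias)
    if len(found) > 1:
        raise ValueError(
            "alias [%s] has more than one write index [%s]" % (alias, ",".join(found))
        )
    return found[0] if found else None


# Only the two indices visible in the response above; elided entries are omitted.
sample = {
    "counter-000013": {"aliases": {"counter-alias": {"is_write_index": False}}},
    "counter-000014": {"aliases": {"counter-alias": {"is_write_index": True}}},
}

print(check_single_write_index(sample, "counter-alias"))  # prints counter-000014
```

Run against what the API returns, the check passes with `counter-000014` as the only write index, so the metadata that the crash-looping node loads at startup presumably differs from this view.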
It seems to me that the state of the alias might be inconsistent between the two data nodes, and that the second data node somehow has both indices marked as write indices. Is that possible?
Thanks in advance for the help.