IllegalStateException: Alias has more than one write index

We are running Elasticsearch 6.8.3 with 3 master, 2 client, and 2 data nodes. Our cluster is currently in the red state:

curl -X GET "$HOST/_cat/health"
1575572989 19:09:49 es-cluster-dogfood red 6 1 1050 1050 0 0 1116 0 - 48.5%

We are seeing some odd errors related to one of our index aliases, which is managed by an ILM policy. One of our two data nodes is crash looping while the other is running fine. Because of this, the crashing data node is unable to join the cluster and none of the replica shards can be assigned. This is the error from the logs on that data node:

    org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:163) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.8.3.jar:6.8.3]
    at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:116) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) ~[elasticsearch-6.8.3.jar:6.8.3]
Caused by: java.lang.IllegalStateException: alias [counter-alias] has more than one write index [counter-000014,counter-000013]
    at org.elasticsearch.cluster.metadata.AliasOrIndex$Alias.computeAndValidateWriteIndex(AliasOrIndex.java:173) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.cluster.metadata.MetaData$Builder.lambda$buildAliasAndIndexLookup$1(MetaData.java:1158) ~[elasticsearch-6.8.3.jar:6.8.3]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[?:?]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177) ~[?:?]
    at java.util.TreeMap$ValueSpliterator.forEachRemaining(TreeMap.java:2890) ~[?:?]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[?:?]
    at org.elasticsearch.cluster.metadata.MetaData$Builder.buildAliasAndIndexLookup(MetaData.java:1158) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.cluster.metadata.MetaData$Builder.build(MetaData.java:1122) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.gateway.MetaStateService.loadFullState(MetaStateService.java:73) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.gateway.GatewayMetaState.<init>(GatewayMetaState.java:88) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.node.Node.<init>(Node.java:499) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.node.Node.<init>(Node.java:266) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:212) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:212) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-6.8.3.jar:6.8.3]
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-6.8.3.jar:6.8.3]
    ... 6 more

The important part of this error is:
alias [counter-alias] has more than one write index [counter-000014,counter-000013]

When I look at that alias, though, it shows that only one of those indices is write enabled:

curl -X GET "$HOST/_alias/counter-alias?pretty"
{
  ...
  "counter-000013" : {
    "aliases" : {
      "counter-alias" : {
        "is_write_index" : false
      }
    }
  },
  ...
  "counter-000014" : {
    "aliases" : {
      "counter-alias" : {
        "is_write_index" : true
      }
    },
   ...
}
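
To double-check the response rather than eyeballing it, here is a minimal Python sketch that lists which indices an alias response marks as the write index. It assumes the response has the shape shown above; the function name write_indices and the sample payload are made up for illustration and are not part of any Elasticsearch client. Note that this only inspects what the API returns from the running cluster. The stack trace shows the failing node hitting the error while loading state in MetaStateService.loadFullState during startup, so that node's own copy may well differ.

```python
import json


def write_indices(alias_response, alias_name):
    """Return the sorted index names flagged as write index for alias_name.

    alias_response is the parsed JSON body of GET /_alias/<alias_name>
    (assumed shape: {index: {"aliases": {alias: {"is_write_index": bool}}}}).
    """
    return sorted(
        index
        for index, body in alias_response.items()
        if body.get("aliases", {}).get(alias_name, {}).get("is_write_index") is True
    )


# Hypothetical sample mirroring the two indices named in the error message.
sample = json.loads("""
{
  "counter-000013": {"aliases": {"counter-alias": {"is_write_index": false}}},
  "counter-000014": {"aliases": {"counter-alias": {"is_write_index": true}}}
}
""")

flagged = write_indices(sample, "counter-alias")
print(flagged)  # ['counter-000014']
if len(flagged) > 1:
    print("alias has more than one write index:", flagged)
```

More than one name in the result would match the condition the exception complains about. A single name, as here, suggests the state served by the API is consistent.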

It seems to me like the state of the alias on the 2 data nodes might be inconsistent? And somehow the second data node has two indices write enabled?

Thanks in advance for the help.

ping?

One more ping?

We'd like to try to root cause this, but if we get no reply we can just nuke the bad data node.
