Failed to find metadata for existing index after node restart

Recently we upgraded from 6.7.1 to 7.2.0. After the upgrade, if a node leaves the cluster it sometimes runs into the error below. The only way we have found to fix this is to delete the data directory, which loses any data on that node.

[2019-07-21T08:07:13,264][ERROR][o.e.g.GatewayMetaState   ] [dcmipvmnsm003] failed to read or upgrade local state, exiting...
java.io.IOException: failed to find metadata for existing index alert-2019.04.04 [location: glGVcnWpRg2PhzxB--5WAw, generation: 12]
	at org.elasticsearch.gateway.MetaStateService.loadFullState(MetaStateService.java:99) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.gateway.GatewayMetaState.upgradeMetaData(GatewayMetaState.java:148) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.gateway.GatewayMetaState.<init>(GatewayMetaState.java:102) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.node.Node.<init>(Node.java:473) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.node.Node.<init>(Node.java:251) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:221) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:221) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:349) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) [elasticsearch-cli-7.2.0.jar:7.2.0]
	at org.elasticsearch.cli.Command.main(Command.java:90) [elasticsearch-cli-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) [elasticsearch-7.2.0.jar:7.2.0]
[2019-07-21T08:07:13,270][ERROR][o.e.b.Bootstrap          ] [dcmipvmnsm003] Exception
org.elasticsearch.ElasticsearchException: failed to bind service
	at org.elasticsearch.node.Node.<init>(Node.java:580) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.node.Node.<init>(Node.java:251) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:221) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:221) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:349) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) [elasticsearch-cli-7.2.0.jar:7.2.0]
	at org.elasticsearch.cli.Command.main(Command.java:90) [elasticsearch-cli-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115) [elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) [elasticsearch-7.2.0.jar:7.2.0]
Caused by: java.io.IOException: failed to find metadata for existing index alert-2019.04.04 [location: glGVcnWpRg2PhzxB--5WAw, generation: 12]
	at org.elasticsearch.gateway.MetaStateService.loadFullState(MetaStateService.java:99) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.gateway.GatewayMetaState.upgradeMetaData(GatewayMetaState.java:148) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.gateway.GatewayMetaState.<init>(GatewayMetaState.java:102) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.node.Node.<init>(Node.java:473) ~[elasticsearch-7.2.0.jar:7.2.0]
	... 11 more
[2019-07-21T08:07:13,273][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [dcmipvmnsm003] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: IOException[failed to find metadata for existing index alert-2019.04.04 [location: glGVcnWpRg2PhzxB--5WAw, generation: 12]];
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:163) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-7.2.0.jar:7.2.0]
	at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) ~[elasticsearch-7.2.0.jar:7.2.0]
Caused by: org.elasticsearch.ElasticsearchException: failed to bind service
	at org.elasticsearch.node.Node.<init>(Node.java:580) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.node.Node.<init>(Node.java:251) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:221) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:221) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:349) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-7.2.0.jar:7.2.0]
	... 6 more
Caused by: java.io.IOException: failed to find metadata for existing index alert-2019.04.04 [location: glGVcnWpRg2PhzxB--5WAw, generation: 12]
	at org.elasticsearch.gateway.MetaStateService.loadFullState(MetaStateService.java:99) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.gateway.GatewayMetaState.upgradeMetaData(GatewayMetaState.java:148) ~[elasticsearch-7.2.0.jar:7.2.0]
	at org.elasticsearch.gateway.GatewayMetaState.<init>(GatewayMetaState.java:102) ~

Hi @Vertigo and welcome!

This error indicates that some of the contents of the data path unexpectedly went missing across the restart. Compared to earlier versions, Elasticsearch 7 applies stricter consistency checks to the contents of the data path so that it can detect this kind of loss. You need to make sure that everything in the data path remains intact across a restart.
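
One way to check whether the data path really stays intact is to record a file listing while the node is stopped, and compare it against a second listing taken just before the node is started again. The following is only a minimal sketch of that idea, not an Elasticsearch tool: the function names and the manifest file format are made up for illustration, and it assumes the node is fully shut down for both listings, since Elasticsearch legitimately creates and deletes files while it is running.

```python
import json
import os


def build_manifest(root):
    """Map every file under root (as a path relative to root) to its size in bytes."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            try:
                manifest[rel] = os.path.getsize(full)
            except OSError:
                # The file disappeared between listing and stat; record it anyway.
                manifest[rel] = None
    return manifest


def save_manifest(manifest, path):
    """Write the manifest to a JSON file so it survives the restart."""
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)


def load_manifest(path):
    """Read a manifest previously written by save_manifest."""
    with open(path) as f:
        return json.load(f)


def diff_manifests(before, after):
    """Report files that went missing, appeared, or changed size between two manifests."""
    before_paths = set(before)
    after_paths = set(after)
    return {
        "missing": sorted(before_paths - after_paths),
        "added": sorted(after_paths - before_paths),
        "resized": sorted(
            p for p in before_paths & after_paths if before[p] != after[p]
        ),
    }
```

For example, you could call `save_manifest(build_manifest("/data/nsm/elasticsearch"), "before.json")` right after stopping the node, then build a second manifest immediately before starting it and pass both to `diff_manifests`. Any entry under `missing` was removed while Elasticsearch was not running, which would point at something outside Elasticsearch.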

OK, is there a way to ignore the inconsistency and move past this error, or to just delete the shard that is having issues?

Also, how do we ensure that the data path remains intact? The only service that has access to that directory is elasticsearch.

Thanks!

No, ignoring this problem is a very bad idea. If this data is unexpectedly going missing then who knows what else is going wrong?

To be clear, Elasticsearch does not get into this broken state on its own. There's something else involved too. Can you share the full content of your elasticsearch.yml? Avoid redacting as much as you can, and identify any redactions as clearly as possible.

Here is our config:

cluster.name: nsm
node.name: dcwipvmnsm015
path.data: "/data/nsm/elasticsearch"
path.logs: "/data/nsm/log/elasticsearch"
network.host: 0.0.0.0
http.port: '9200'
transport.tcp.port: '9300'
discovery.seed_hosts:
- dcwepscpb025-p.edc.nam.gm.com:9304
- dcwepscpb024-p.edc.nam.gm.com:9304
- dcwepscpb011-p.edc.nam.gm.com:9304
discovery.zen.minimum_master_nodes: '2'
cluster.join.timeout: '120s'
node.ml: false
node.data: true
node.ingest: true
node.master: false

xpack.security.http.ssl.key: /etc/elasticsearch/certs/host.key
xpack.security.http.ssl.certificate: /etc/elasticsearch/certs/host.pem
xpack.security.http.ssl.certificate_authorities: /etc/elasticsearch/certs/chain.pem
xpack.security.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.verification_mode: none
xpack.security.transport.ssl.key: /etc/elasticsearch/certs/host.key
xpack.security.transport.ssl.certificate: /etc/elasticsearch/certs/host.pem
xpack.security.transport.ssl.certificate_authorities: /etc/elasticsearch/certs/chain.pem

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: none
xpack.watcher.index.rest.direct_access: true
xpack.security.transport.ssl.supported_protocols: [ "TLSv1.2", "TLSv1.1"]
xpack.security.http.ssl.supported_protocols: [ "TLSv1.2", "TLSv1.1"]

Thanks. I don't see anything in your config that'd make this error more likely. It really does look like some files are going missing from /data/nsm/elasticsearch outside of Elasticsearch's control.
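
To narrow down which file has gone missing, it can help to pull the index name, location (index UUID) and generation out of the exception message and work out where the metadata file would be expected on disk. Here is a rough sketch of that. The helper names are hypothetical, and the path layout (`nodes/0/indices/<uuid>/_state/state-<generation>.st`) is an assumption about how 7.x lays out index metadata on disk, so verify it against a healthy node before relying on it.

```python
import re
from pathlib import Path

# Matches the message shown in the stack trace, e.g.
# "failed to find metadata for existing index NAME [location: UUID, generation: N]"
ERROR_RE = re.compile(
    r"failed to find metadata for existing index (?P<index>\S+) "
    r"\[location: (?P<uuid>[^,\]]+), generation: (?P<generation>\d+)\]"
)


def parse_missing_metadata(line):
    """Return (index name, index UUID, generation) from a log line, or None if it does not match."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    return match["index"], match["uuid"], int(match["generation"])


def expected_state_file(data_path, uuid, generation, node_ordinal=0):
    """Build the path where the index metadata file is assumed to live.

    ASSUMPTION: the layout nodes/<ordinal>/indices/<uuid>/_state/state-<generation>.st
    is not confirmed by this thread; check it on a node that starts cleanly.
    """
    return (
        Path(data_path)
        / "nodes"
        / str(node_ordinal)
        / "indices"
        / uuid
        / "_state"
        / f"state-{generation}.st"
    )
```

With the message from your log, `parse_missing_metadata` gives the index `alert-2019.04.04`, the location `glGVcnWpRg2PhzxB--5WAw` and generation `12`. Calling `.exists()` on the result of `expected_state_file("/data/nsm/elasticsearch", ...)` and on its parent directories then shows whether just the one file or a whole directory has disappeared.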
