Recovering from missing state .si file

samuel · April 25, 2020, 5:48am

The Elasticsearch service was terminated and now fails to restart. The error returned is:

Job for elasticsearch.service failed because the control process exited with error code.
See "systemctl status elasticsearch.service" and "journalctl -xe" for details.

The log says:
org.elasticsearch.bootstrap.StartupException: ElasticsearchException[failed to bind service]; nested: CorruptIndexException[Unexpected file read error while reading index. (resource=BufferedChecksumIndexInput(SimpleFSIndexInput(path="/var/lib/elasticsearch/nodes/0/_state/segments_11mz")))]; nested: NoSuchFileException[/var/lib/elasticsearch/nodes/0/_state/_kfz.si];

I can confirm that the file /var/lib/elasticsearch/nodes/0/_state/_kfz.si is indeed missing.

How can I recover Elasticsearch to a state where I can restart it?

DavidTurner · April 25, 2020, 6:43am

Unfortunately that file is essential to Elasticsearch. How did you get it into this state?

Assuming you don't have a copy of this file elsewhere, your best bet is to wipe this node and start again, allowing Elasticsearch to recover any missing shards from the other nodes in the cluster. Alternatively you can restore from a recent snapshot.

samuel · April 25, 2020, 11:24am

Thanks for the response. I haven't been able to identify the root cause leading to the missing file. Elasticsearch has been running on a server without me doing anything.

The cluster only has one node but I have a snapshot from a few days back. What's the recommended procedure to "wipe this node"? Uninstall Elasticsearch, remove /var/lib/elasticsearch, reinstall Elasticsearch and then restore from snapshot? (I'm running Debian and using apt-get to install ES if it matters.)

DavidTurner · April 25, 2020, 11:37am

According to the log message your data path is /var/lib/elasticsearch, which means it should be enough to delete the contents of that directory and start Elasticsearch up again. No need to uninstall/reinstall anything AFAIK.
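
A minimal sketch of that step, assuming the systemd unit is named elasticsearch and the data path is /var/lib/elasticsearch (both taken from the messages above). The function name set_aside_data_path and the backup location are hypothetical. It moves the contents aside rather than deleting them, so the old files can still be inspected later; plain deletion works as well if disk space is tight.

```shell
#!/usr/bin/env bash
# Sketch only: empty an Elasticsearch data path by moving its contents elsewhere.
# The function name and the backup directory are illustrative assumptions.

set_aside_data_path() {
  local data_path="$1"   # e.g. /var/lib/elasticsearch
  local backup_dir="$2"  # must live outside data_path, with enough free space

  [ -d "$data_path" ] || { echo "no such directory: $data_path" >&2; return 1; }
  mkdir -p "$backup_dir" || return 1

  # Move every top-level entry (dotfiles included) out of the data path.
  # The data directory itself stays in place, so its ownership is unchanged.
  find "$data_path" -mindepth 1 -maxdepth 1 -exec mv {} "$backup_dir"/ \;
}

# Intended usage, run as root with the node stopped first:
#   systemctl stop elasticsearch
#   set_aside_data_path /var/lib/elasticsearch /var/backups/es-old-data
#   systemctl start elasticsearch
```

On a single-node cluster this leaves an empty node, so everything has to come back from a snapshot afterwards.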

samuel · April 25, 2020, 7:48pm

Thanks a lot!

After wiping /var/lib/elasticsearch I had to reset the password for the elastic user but I managed.

The next issue is that I either don't know the name of my snapshot or there's still something missing. When I try to run

curl -XPOST -u elastic localhost:9200/_snapshot/$MY_BACKUP/$MY_SNAPSHOT/_restore

with various versions of $MY_BACKUP and $MY_SNAPSHOT I always get the response repository_missing_exception. I can access my snapshot folder but don't know how to fetch the backup and snapshot names.

DavidTurner · April 25, 2020, 9:25pm

Did you register the repository again? If not, you'll need to do that. You can list the currently-registered repositories with GET /_snapshot/_all, and list the snapshots within a repository called $REPOSITORY_NAME using GET /_snapshot/$REPOSITORY_NAME/_all.
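
As a rough sketch, those calls could look like this with curl. It assumes a shared-filesystem (type fs) repository, which is a guess based on the mention of a snapshot folder; other repository types need different settings. For an fs repository the folder also has to be listed under path.repo in elasticsearch.yml before it can be registered. The function names, ES_URL and ES_USER are placeholders, not anything Elasticsearch defines.

```shell
#!/usr/bin/env bash
# Sketch only: thin curl wrappers around the snapshot REST endpoints.
# Function names and the ES_URL / ES_USER variables are illustrative assumptions.

ES_URL="${ES_URL:-http://localhost:9200}"
ES_USER="${ES_USER:-elastic}"   # curl prompts for this user's password

# JSON body for registering a shared-filesystem repository.
# Assumes the location contains no characters that need JSON escaping.
repo_body() {
  printf '{"type":"fs","settings":{"location":"%s"}}' "$1"
}

# register_repo REPOSITORY_NAME /path/to/snapshot/folder
register_repo() {
  curl -sS -u "$ES_USER" -X PUT "$ES_URL/_snapshot/$1" \
    -H 'Content-Type: application/json' -d "$(repo_body "$2")"
}

# List the currently registered repositories.
list_repos() {
  curl -sS -u "$ES_USER" "$ES_URL/_snapshot/_all"
}

# list_snapshots REPOSITORY_NAME
list_snapshots() {
  curl -sS -u "$ES_USER" "$ES_URL/_snapshot/$1/_all"
}

# restore_snapshot REPOSITORY_NAME SNAPSHOT_NAME
restore_snapshot() {
  curl -sS -u "$ES_USER" -X POST "$ES_URL/_snapshot/$1/$2/_restore"
}
```

The repository name passed to register_repo is a new label chosen at registration time; the snapshot names come back from list_snapshots once the repository is registered.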

samuel · May 1, 2020, 7:48am

Thanks again!

I was able to restore a snapshot.

(I also added another node to my cluster and I'm in the process of adding a third one.)

By the way, am I supposed to find documentation of _snapshot under https://www.elastic.co/guide/en/elasticsearch/reference/7.6/rest-apis.html?

(I can find https://www.elastic.co/guide/en/elasticsearch/reference/7.6/snapshot-restore.html, https://www.elastic.co/guide/en/elasticsearch/reference/7.6/snapshots-register-repository.html and https://www.elastic.co/guide/en/elasticsearch/reference/7.6/snapshots-take-snapshot.html but under the REST APIs I can only find snapshot lifecycle management documentation...)

DavidTurner · May 1, 2020, 8:53am

That's probably not deliberate. The structure of the reference manual is undergoing some big improvements at the moment, so there are some inconsistencies in exactly how and where things are documented. I opened https://github.com/elastic/elasticsearch/issues/56069 in case that omission isn't tracked elsewhere.

system · May 29, 2020, 8:58am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
