The only reason we can imagine is that you were using dedicated master nodes. In that case, the _state dir does not live on the data nodes but on the master nodes.
Can you confirm you are using dedicated master nodes?
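If you are not sure, the cat nodes API should show it. Something like this (assuming the cluster answers on localhost:9200) lists which nodes are master-eligible and which hold data:

    curl -s 'localhost:9200/_cat/nodes?v&h=host,name,master,node.role'

Dedicated masters should show m (or * for the elected master) in the master column and no d in node.role.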
I'm reading the source code again and I don't understand how this could happen.
final Path stateDir = dataLocation.resolve(STATE_DIR_NAME);
// now, iterate over the current versions, and find latest one
// we don't check if the stateDir is present since it could be deleted
// after the check. Also if there is a _state file and it's not a dir something is really wrong
try (DirectoryStream<Path> paths = Files.newDirectoryStream(stateDir)) { // we don't pass a glob since we need the group part for parsing
    for (Path stateFile : paths) {
        final Matcher matcher = stateFilePattern.matcher(stateFile.getFileName().toString());
        if (matcher.matches()) {
            final long stateId = Long.parseLong(matcher.group(1));
            maxStateId = Math.max(maxStateId, stateId);
            final boolean legacy = MetaDataStateFormat.STATE_FILE_EXTENSION.equals(matcher.group(2)) == false;
            maxStateIdIsLegacy &= legacy; // on purpose, see NOTE below
            PathAndStateId pav = new PathAndStateId(stateFile, stateId, legacy);
            logger.trace("found state file: {}", pav);
            files.add(pav);
        }
    }
} catch (NoSuchFileException | FileNotFoundException ex) {
    // no _state directory -- move on
}
stateDir here is _state. We check for files like state-XX.st, which seem to exist here. So logger.trace("found state file: {}", pav); should print them...
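One way to confirm that from your side is to turn on TRACE logging for the gateway classes and watch the startup log. A sketch, assuming a 2.x node started from the command line (logger.gateway is, as far as I know, the shortcut for the org.elasticsearch.gateway package, and it can also be put in elasticsearch.yml or logging.yml):

    # start the node with TRACE logging for the gateway package, so the
    # "found state file: ..." line is printed if the loop above sees the files
    bin/elasticsearch -Des.logger.gateway=TRACE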
Are those files readable by the elasticsearch user that is running the elasticsearch process?
What does ls -l /index_53/1/_state give, for example? And what does the same command give for an index which is loaded correctly by elasticsearch?
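It can also help to run the same listing as the user that runs the Elasticsearch process, to rule out a permission problem; a quick check (the user name elasticsearch is an assumption here, adjust it to your setup):

    # list the shard state directory as the user running elasticsearch
    sudo -u elasticsearch ls -l /index_53/1/_state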
The state-29290.st file does indeed seem to be very small.
Also, I can see that your first post happened on Dec 1st. The date of this latest .st file is Nov 30, 13:41.
Any chance you recall what you did before this date?
I think you tried to roll back to 1.7 around that time, right?
Yes, as mentioned in the first posts, I updated and got some errors referring to the mapping change.
After waiting for a while (one or two hours) with no change, I shut everything down again, deployed the old ES 1.7.1, started it up again, and found all data lost.