Mistakenly upgraded without reindexing or the Upgrade Assistant. Now I cannot fix it

We upgraded from 6.4 to 7.4 thinking we didn't have any 5.x indices. It turns out we did. Elasticsearch wouldn't start (by design, documented here).

We tried to roll back to 6.4 so we could start Elasticsearch and delete the offending indices, but then we ran into this problem:

org.elasticsearch.ElasticsearchException: java.io.IOException: failed to read [id:66, file:/data/elasticsearch/my_logs/nodes/0/_state/node-66.st]
....
Caused by: org.elasticsearch.common.xcontent.XContentParseException: [-1:36] [node_meta_data] unknown field [node_version], parser not found

That seems to be backed up by a forum post here.

So now I don't know what to do. I can't start Elasticsearch in 7.x because of the broken indices, and I can't start it in 6.4 because of the error above. Is there any way out of this situation, or did I just lose my entire repo?

That would be... very very bad.

If you took a snapshot of your indices prior to the upgrade (which is highly recommended before any upgrade), you could simply do a clean install of 6.4, start the cluster up, and restore the snapshot. Once all indices have been restored and the cluster is green, delete the 5.x indices and start the upgrade again.
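In case it helps anyone reading later, the restore flow would look roughly like this (a sketch assuming curl against localhost:9200; the repository name my_backup, the snapshot name pre_upgrade, and the paths are all placeholders for your own setup):

# Register the snapshot repository on the fresh 6.4 cluster
# (the location must be whitelisted via path.repo in elasticsearch.yml)
curl -X PUT "localhost:9200/_snapshot/my_backup" -H 'Content-Type: application/json' -d'
{ "type": "fs", "settings": { "location": "/mnt/backups/my_backup" } }'

# Restore all indices from the pre-upgrade snapshot
curl -X POST "localhost:9200/_snapshot/my_backup/pre_upgrade/_restore?wait_for_completion=true"

# Once the cluster is green, find the indices created on 5.x
# (their index.version.created setting begins with 5)
curl -X GET "localhost:9200/_all/_settings/index.version.created?pretty"

# ...and delete them before retrying the upgrade (placeholder index name)
curl -X DELETE "localhost:9200/<some-5x-index>"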

Sadly, no snapshot (due to space issues). We did the best we could to mitigate the risk by testing in dev and staging environments first, but neither had old indices, so we didn't see the problem until it happened.

Any other approaches?

Do you have a copy of the exception and stack trace that the 7.x nodes emitted when they refused to start up? If they noticed the problem before they got too far through the upgrade then it might be recoverable.

I have the stacktrace for each stage of my stumbling and bumbling. Which one are you interested in specifically?

Share as much as you can! Use https://gist.github.com since it likely won't fit here.

@DavidTurner here you go

Ok, good, this node stopped in GatewayMetaState.upgradeMetaData which looks promising. Is there a file matching /data/elasticsearch/my_logs/nodes/0/_state/manifest-*.st?
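Something like this should show what's there (assuming shell access on that node):

# List the node's _state directory to see which metadata files are present
ls /data/elasticsearch/my_logs/nodes/0/_state/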

No manifest-*.st. Two files present: global-129.st and node-68.st.

There aren't many circumstances in which it's ok to manipulate the contents of the data path by hand, but you are in luck: you were upgrading from 6.4 to 7.4.2 and the upgrade stopped at the right place. You should be ok to remove /data/elasticsearch/my_logs/nodes/0/indices/kDe4rckLScex-zgXJNeOyQ/, which will get rid of the problematic logstash-2018.08.17 index.
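If you want a concrete recipe, something along these lines should do it (a cautious sketch; I'm assuming systemd here, and /root/es-rescue is just an example backup location):

# Make sure Elasticsearch is stopped on this node first
systemctl stop elasticsearch

# Keep a copy of the directory so this change can be undone if needed
mkdir -p /root/es-rescue
cp -a /data/elasticsearch/my_logs/nodes/0/indices/kDe4rckLScex-zgXJNeOyQ /root/es-rescue/

# Remove the offending index's directory, then start the 7.4 node again
rm -rf /data/elasticsearch/my_logs/nodes/0/indices/kDe4rckLScex-zgXJNeOyQ
systemctl start elasticsearch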

Thanks David. We went through our logs using the strategy you listed and got things up and running.
