Recover corrupt/non-working Elasticsearch data

Hi!

So I'm really not sure what caused this. We have an instance of Elasticsearch and Kibana running on one of our computers at the office. After we had to move desks around, the whole thing stopped working. Elasticsearch seems to be running fine, but when I ran sudo service kibana status to check on Kibana, I got this:

https://pastebin.com/UFmqXdCY

So I get some sort of search_phase_execution_exception. I get the same error when I try to reindex the data to a new server. I read somewhere that I might get more information about broken indices by running

curl -X GET "localhost:9200/_cluster/allocation/explain" -H 'Content-Type: application/json' -d'
{
  "index": "cpi_modules_shipped",
  "shard": 0,
  "primary": true
}
'

The result I get is:
https://pastebin.com/ChyNMxtT

Not sure what to try next!

UPDATE:
Tried running

curl -X POST 'localhost:9200/_cluster/reroute?retry_failed=true'

as this was suggested by the previous error message. It returned the following very long message, which complains about "device or resource busy". This is the same error I kept getting when I tried to scp the index data manually from the server.

https://pastebin.com/Swh0JLBz

There's something rather wrong with the storage system (or filesystem) on this node: the files that make up the translog seem to be inaccessible. I would suggest shutting down Elasticsearch on this node and checking some lower-level things (filesystem, disks, kernel messages etc).

Is it part of a cluster, and are there replicas, or is this just a standalone node? If it's standalone, can you restore it from a snapshot?
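For anyone unsure what "lower-level things" to look at, here is a minimal sketch of a first pass. The function names are made up for illustration, and the example path is only an assumption: use whatever path.data and path.logs are set to in your elasticsearch.yml. The "writable" line reflects the user running the script, not necessarily the elasticsearch user.

```shell
# Hypothetical helper: summarise the state of a directory Elasticsearch uses.
path_report() {
  dir=$1
  if [ -d "$dir" ]; then
    echo "exists: yes"
  else
    echo "exists: no"
    return 0
  fi
  # Writable for the user running this script (not necessarily "elasticsearch").
  if [ -w "$dir" ]; then echo "writable: yes"; else echo "writable: no"; fi
  # Owner, group and mode of the directory itself.
  ls -ld "$dir"
  # Which filesystem backs the directory, and how full it is.
  df -P "$dir"
}

# Hypothetical helper: recent kernel messages that mention storage trouble.
# May need root on systems that restrict dmesg.
kernel_storage_messages() {
  dmesg | grep -iE 'i/o error|ext4|xfs|nfs|remount' | tail -n 50
}

# Example usage (the path is an assumption):
# path_report /var/lib/elasticsearch
# kernel_storage_messages
```

If the df output shows a different filesystem than expected, or the kernel messages mention I/O errors, that would point at the storage rather than at Elasticsearch itself.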

Hi! Thanks for the fast reply.

This is a standalone node. It is hosted on Ubuntu through VirtualBox. I have never made any snapshots, so no, I guess I can't :confused:. I can try to check that stuff, but I'm not really sure what I'm looking for...
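For future reference, taking snapshots on a standalone node could look roughly like the sketch below. It uses the snapshot REST API with a shared-filesystem ("fs") repository. The repository name, snapshot name and backup location are placeholders, and the location has to be listed under path.repo in elasticsearch.yml before Elasticsearch will accept it.

```shell
ES_URL=${ES_URL:-http://localhost:9200}   # same host and port as the curl commands above

# JSON body for an "fs" snapshot repository.
# No escaping is done, so the location must not contain quotes or backslashes.
fs_repo_body() {
  printf '{"type":"fs","settings":{"location":"%s"}}\n' "$1"
}

# register_repo <repo-name> <location>
register_repo() {
  curl -sS -X PUT "$ES_URL/_snapshot/$1" \
    -H 'Content-Type: application/json' \
    -d "$(fs_repo_body "$2")"
}

# take_snapshot <repo-name> <snapshot-name>
take_snapshot() {
  curl -sS -X PUT "$ES_URL/_snapshot/$1/$2?wait_for_completion=true"
}

# restore_snapshot <repo-name> <snapshot-name>
# Indices being restored must be closed or must not exist yet.
restore_snapshot() {
  curl -sS -X POST "$ES_URL/_snapshot/$1/$2/_restore"
}

# Example usage (all names and the path are placeholders):
# register_repo my_backup /mnt/backups/elasticsearch
# take_snapshot my_backup snapshot_1
```

Keeping the repository on a different disk or machine than the data folder is what makes it useful in a situation like this one.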

Is it btw possible to move the data from one server to another if I manage to copy the data folder?

Yes, that is possible, as long as you've shut Elasticsearch down before starting to take the copy. You should be careful never to use the old data folder again.
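A minimal sketch of that copy, assuming the systemd unit is called elasticsearch (as the logs in this thread suggest) and using placeholder paths. The function names are made up for illustration. It refuses to copy while the service is active and compares the copy with the source afterwards.

```shell
# True if systemd reports the elasticsearch unit as active.
es_is_running() {
  command -v systemctl >/dev/null 2>&1 && systemctl is-active --quiet elasticsearch
}

# copy_data_dir <source-dir> <destination-dir>
copy_data_dir() {
  src=$1
  dest=$2
  if es_is_running; then
    echo "Elasticsearch is still running; stop it first (sudo systemctl stop elasticsearch)" >&2
    return 1
  fi
  if [ ! -d "$src" ]; then
    echo "source directory not found: $src" >&2
    return 1
  fi
  # -a preserves ownership, permissions and timestamps; "/." also copies hidden files.
  mkdir -p "$dest" && cp -a "$src/." "$dest/" || return 1
  # Fail if the copy differs from the source in any way.
  diff -r "$src" "$dest" >/dev/null
}

# Example usage (paths are placeholders):
# copy_data_dir /var/lib/elasticsearch /mnt/usb/elasticsearch-data-copy
```

On the new server the copied folder would then be used as path.data, with the same Elasticsearch version or a compatible newer one, and the old folder left alone.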

"device or resource busy" could simply indicate that something else is still using a file. If you can't work out what, it's possible that shutting down the Ubuntu VM and then rebooting the host will get rid of it. OTOH this might make it worse, so proceed at your own risk :slight_smile:
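To work out what is holding a file open, tools such as lsof +D <dir> or fuser -vm <mountpoint> are the usual choice where they are installed. As a dependency-free alternative, the sketch below walks /proc on Linux. The function name is made up for illustration, and without root it only sees processes owned by the current user.

```shell
# list_open_under <dir>
# Prints "<pid> <path>" for every open file descriptor that points below <dir>.
# Linux only (relies on /proc); run as root to see other users' processes.
list_open_under() {
  dir=$(cd "$1" && pwd -P) || return 1
  for fd in /proc/[0-9]*/fd/*; do
    # Skip descriptors we cannot read, or that vanished in the meantime.
    target=$(readlink "$fd" 2>/dev/null) || continue
    case $target in
      "$dir"/*)
        pid=${fd#/proc/}
        pid=${pid%%/*}
        echo "$pid $target"
        ;;
    esac
  done
}

# Example usage (the path is a placeholder):
# sudo sh -c '. ./open_files.sh; list_open_under /var/lib/elasticsearch'
```

If this prints nothing while the error persists, the lock is probably held outside the VM (for example by the host or a network share), which this sketch cannot see.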

Well, you did warn me. I restarted using sudo reboot. Guess what? The old locations for data and logs are gone! Really not sure how this could have happened. I tried starting the service, but got an error saying that Elasticsearch was unable to create the logs directory:

-- Logs begin at Wed 2019-03-13 14:29:32 CET, end at Wed 2019-03-13 14:42:25 CET. --
Mar 13 14:29:33 kpidb systemd[1]: Started Elasticsearch.
Mar 13 14:29:38 kpidb elasticsearch[307]: 2019-03-13 14:29:38,323 main ERROR Unable to create file /nas/elastic/elasticsearch/logs/acconeer-kpis.log java.io.IOException: Could not create directory /nas/elastic/elasticsearch/logs
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.util.FileUtils.mkdir(FileUtils.java:127)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.util.FileUtils.makeParentDirs(FileUtils.java:144)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.appender.rolling.RollingFileManager$RollingFileManagerFactory.createManager(RollingFileManager.java:627)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.appender.rolling.RollingFileManager$RollingFileManagerFactory.createManager(RollingFileManager.java:608)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.appender.AbstractManager.getManager(AbstractManager.java:113)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.appender.OutputStreamManager.getManager(OutputStreamManager.java:114)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.appender.rolling.RollingFileManager.getFileManager(RollingFileManager.java:188)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.appender.RollingFileAppender$Builder.build(RollingFileAppender.java:145)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.appender.RollingFileAppender$Builder.build(RollingFileAppender.java:61)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:123)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:959)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:899)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:891)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:514)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:238)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:250)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:547)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:263)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.elasticsearch.common.logging.LogConfigurator.configure(LogConfigurator.java:234)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.elasticsearch.common.logging.LogConfigurator.configure(LogConfigurator.java:127)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:302)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:136)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:127)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.elasticsearch.cli.Command.main(Command.java:90)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93)
Mar 13 14:29:38 kpidb elasticsearch[307]:         at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:86)
Mar 13 14:29:38 kpidb elasticsearch[307]: 2019-03-13 14:29:38,367 main ERROR Could not create plugin of type class org.apache.logging.log4j.core.appender.RollingFileAppender for element RollingFile: java.lang.IllegalStateException: ManagerFactory [org.apache.logging.log4j.core.appender.rolling.RollingFileManager$RollingFileManagerFactory@37883b97] unable to create manager for [/nas/elastic/elasticsearch/logs/acconeer-kpis.log] with data [org.apache.logging.log4j.core.appender.rolling.RollingFileManager$FactoryData@6ab778a[pattern=/nas/elastic/elasticsearch/logs/acconeer-kpis-%d{yyyy-MM-dd}-%i.log.gz, append=true, bufferedIO=true, bufferSize=8192, policy=CompositeTriggeringPolicy(policies=[TimeBasedTriggeringPolicy(nextRolloverMillis=0, interval=1, modulate=true), SizeBasedTriggeringPolicy(size=134217728)]), strategy=DefaultRolloverStrategy(min=-2147483648, max=2147483647, useMax=false), advertiseURI=null, layout=[%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %.-10000m%n, filePermissions=null, fileOwner=null]] java.lang.IllegalStateException: ManagerFactory [org.apache.logging.log4j.core.appender.rolling.RollingFileManager$RollingFileManagerFactory@37883b97] unable to create manager for [/nas/elastic/elasticsearch/logs/acconeer-kpis.log] with data [org.apache.logging.log4j.core.appender.rolling.RollingFileManager$FactoryData@6ab778a[pattern=/nas/elastic/elasticsearch/logs/acconeer-kpis-%d{yyyy-MM-dd}-%i.log.gz, append=true, bufferedIO=true, bufferSize=8192, policy=CompositeTriggeringPolicy(policies=[TimeBasedTriggeringPolicy(nextRolloverMillis=0, interval=1, modulate=true), SizeBasedTriggeringPolicy(size=134217728)]), strategy=DefaultRolloverStrategy(min=-2147483648, max=2147483647, useMax=false), advertiseURI=null, layout=[%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %.-10000m%n, filePermissions=null, fileOwner=null]]

UPDATE:
Changed the permissions on the folder. Starting Elasticsearch seems to work now. The data seems to be gone, though. What the ****

Oh dear.

I would suspect some combination of hardware issues and/or uncleanly restarting the VM leaving the filesystem in an inconsistent state.

Is Elasticsearch reporting any useful error messages?

Sigh. I can't see anything useful in the logs. Journalctl just doesn't log anything between me rebooting the server and then starting Elasticsearch. Both Kibana and Elasticsearch are running smoothly now, like nothing happened. I can't understand why simply restarting the server would cleanly wipe my logs, data and backup folder...

-- Logs begin at Wed 2019-03-13 14:29:32 CET, end at Wed 2019-03-13 14:58:57 CET. --
Mar 13 14:46:13 kpidb systemd[1]: Started Elasticsearch.
Mar 13 14:39:20 kpidb systemd[1]: elasticsearch.service: Failed with result 'exit-code'.
Mar 13 14:39:20 kpidb systemd[1]: elasticsearch.service: Unit entered failed state.
Mar 13 14:39:20 kpidb systemd[1]: elasticsearch.service: Main process exited, code=exited, status=1/FAILURE
Mar 13 14:39:20 kpidb elasticsearch[603]: 2019-03-13 14:39:20,068 main ERROR Unable to locate appender "deprecation_rolling" for logger config "org.elasticsearch.deprecation"
Mar 13 14:39:20 kpidb elasticsearch[603]: 2019-03-13 14:39:20,068 main ERROR Unable to locate appender "deprecated_audit_rolling" for logger config "org.elasticsearch.xpack.security.audit.logfile.DeprecatedLoggingAuditTrail"
Mar 13 14:39:20 kpidb elasticsearch[603]: 2019-03-13 14:39:20,068 main ERROR Unable to locate appender "index_search_slowlog_rolling" for logger config "index.search.slowlog"
Mar 13 14:39:20 kpidb elasticsearch[603]: 2019-03-13 14:39:20,067 main ERROR Unable to locate appender "audit_rolling" for logger config "org.elasticsearch.xpack.security.audit.logfile.LoggingAuditTrail"
Mar 13 14:39:20 kpidb elasticsearch[603]: 2019-03-13 14:39:20,067 main ERROR Unable to locate appender "index_indexing_slowlog_rolling" for logger config "index.indexing.slowlog.index"
Mar 13 14:39:20 kpidb elasticsearch[603]: 2019-03-13 14:39:20,067 main ERROR Unable to locate appender "rolling" for logger config "root"

This is the log for Kibana, right after I started the server. I mean, the data was gone right after the reboot, so it should not really be Elasticsearch's or Kibana's fault?

https://pastebin.com/VMPA9LU7
