{"error":"RepositoryVerificationException[[esprod-usagetracking-2015-06] path is not accessible on master node]; nested: IOException[Filesystem closed]; ","status":500}
The above is the error I receive from a cronjob every other morning. That's the strange part: it works once and then the next run fails, so I only have snapshots for every other day. The snapshots themselves do work, but again, only on every other run. When I run the script repeatedly it returns the error on every other run.
Weird. It looks like the FileSystem object used underneath by the plugin is being affected somehow. What distro are you using? The plugin never closes the FileSystem, nor does it keep creating new ones; however, other Hadoop clients running on the same machine might interfere with it, since the FileSystem relies on an internal cache that is shared across clients (see the sketch below).
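To illustrate what "shared cache" means here, this is a minimal sketch (the class name and the NameNode address are made up, not taken from your setup) of how FileSystem.get() hands out a cached instance, so one client's close() breaks every other holder:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsCacheDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        URI uri = URI.create("hdfs://namenode:8020"); // hypothetical NameNode address

        // Both calls go through Hadoop's internal FileSystem cache, so they
        // return the *same* object for the same scheme/authority/user.
        FileSystem a = FileSystem.get(uri, conf);
        FileSystem b = FileSystem.get(uri, conf);
        System.out.println(a == b); // true

        // If any client (or a JVM shutdown hook) closes its handle...
        a.close();

        // ...every other holder of the cached instance is now broken:
        // this throws java.io.IOException: Filesystem closed
        b.exists(new Path("/"));
    }
}
```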
Do you have a bigger stacktrace (potentially from Elasticsearch itself)?
Can you double check whether there are other jobs interfering with Hadoop every other day? Anything that sticks out from the Hadoop logs?
Do you restart Elasticsearch by any chance?
Running on CentOS 6.6. ES is the only thing running on this machine, and the snapshot script is the only cronjob. ES is not restarted between snapshot creations, but it has been restarted before. Below is from the logs on 6/4:
[2015-06-04 08:45:12,011][INFO ][repositories ] [esprod00] update repository [esprod-usagetracking-2015-06]
[2015-06-04 08:45:12,068][WARN ][snapshots ] [esprod00] failed to create snapshot [esprod-usagetracking-2015-06:snapshot-2015-06-04]
org.elasticsearch.snapshots.SnapshotCreationException: [esprod-usagetracking-2015-06:snapshot-2015-06-04] failed to create snapshot
at org.elasticsearch.repositories.blobstore.BlobStoreRepository.initializeSnapshot(BlobStoreRepository.java:260)
at org.elasticsearch.snapshots.SnapshotsService.beginSnapshot(SnapshotsService.java:278)
at org.elasticsearch.snapshots.SnapshotsService.access$600(SnapshotsService.java:88)
at org.elasticsearch.snapshots.SnapshotsService$1$1.run(SnapshotsService.java:204)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:707)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1448)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1390)
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:394)
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:390)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:390)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:334)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:849)
at org.elasticsearch.hadoop.hdfs.blobstore.HdfsBlobContainer.createOutput(HdfsBlobContainer.java:71)
at org.elasticsearch.repositories.blobstore.BlobStoreRepository.initializeSnapshot(BlobStoreRepository.java:235)
@tgreischel Hi
I've pushed a couple of updates which hopefully should fix your problem:
1. The FileSystem instance is checked to see whether it's alive or not, so in case it is closed, a new one will be created.
2. Instead of using the typical API, which relies on some Hadoop client caching (which can cause the FileSystem to be closed by other clients), a dedicated, private instance is now created, which should be managed just by the plugin itself (though there is a shutdown hook that might close it; however, see #1).
I have pushed a new dev build already in the repository - can you please try it out and let me know how it works for you? You shouldn't get the exception any more, even on subsequent runs.
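For reference, the gist of the change looks roughly like the sketch below. This is not the plugin's actual code; the holder class and the cheap probe on "/" are illustrative assumptions. The key points are using FileSystem.newInstance() instead of the cached FileSystem.get(), and re-creating the instance when it is found to be closed:

```java
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PrivateFsHolder {
    private final URI uri;
    private final Configuration conf;
    private FileSystem fs;

    public PrivateFsHolder(URI uri, Configuration conf) {
        this.uri = uri;
        this.conf = conf;
    }

    // Returns a working FileSystem, recreating it if the old one was closed.
    public synchronized FileSystem get() throws IOException {
        if (fs == null || !isAlive(fs)) {
            // newInstance() bypasses the shared entry returned by
            // FileSystem.get(), so other Hadoop clients closing *their*
            // handles no longer affects this one.
            fs = FileSystem.newInstance(uri, conf);
        }
        return fs;
    }

    // Cheap liveness probe: a closed DFS client fails any call with
    // "java.io.IOException: Filesystem closed".
    private static boolean isAlive(FileSystem fs) {
        try {
            fs.exists(new Path("/"));
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```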