I have multiple environments on Azure, each with a multi-node Elasticsearch cluster.
Everything worked fine for months, but now ES snapshots have suddenly stopped working with a strange error. The error returned when taking a snapshot is below:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "no_such_element_exception",
        "reason" : "An error occurred while enumerating the result, check the original exception for details."
      }
    ],
    "type" : "no_such_element_exception",
    "reason" : "An error occurred while enumerating the result, check the original exception for details.",
    "caused_by" : {
      "type" : "storage_exception",
      "reason" : "The specified account does not exist.",
      "caused_by" : {
        "type" : "null_pointer_exception",
        "reason" : null
      }
    }
  },
  "status" : 500
}
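To make the nested causes easier to read, the `caused_by` chain in a response like this can be flattened into a simple list. This is only a minimal sketch in Python: the helper name `cause_chain` is my own and not part of any Elasticsearch client, and the JSON is the snapshot error with `root_cause` trimmed out.

```python
import json


def cause_chain(error_body):
    """Return [(type, reason), ...] from the outermost error to the innermost caused_by."""
    chain = []
    node = error_body.get("error")
    # Follow the caused_by links until there is no further nested cause.
    while isinstance(node, dict):
        chain.append((node.get("type"), node.get("reason")))
        node = node.get("caused_by")
    return chain


# Snapshot error response, trimmed (root_cause omitted for brevity).
response = json.loads("""
{
  "error": {
    "type": "no_such_element_exception",
    "reason": "An error occurred while enumerating the result, check the original exception for details.",
    "caused_by": {
      "type": "storage_exception",
      "reason": "The specified account does not exist.",
      "caused_by": {"type": "null_pointer_exception", "reason": null}
    }
  },
  "status": 500
}
""")

# Print one cause per line, indented by nesting depth.
for depth, (err_type, reason) in enumerate(cause_chain(response)):
    print("  " * depth + f"{err_type}: {reason}")
```

Read this way, the storage-level message "The specified account does not exist." sits between the generic enumeration error and the innermost null pointer exception.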
I verified that all permissions and keys are unchanged, and nothing has changed in the Blob permissions. I even tried restarting the ES master node, but no luck.
If I run a blob upload command from the same server to the same Blob container, it works fine, so this does not look like a permission issue.
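Since the message complains about the account rather than about access, one more thing that could be checked from the node is whether the account's blob endpoint name resolves at all. The following is a rough sketch, assuming the default public endpoint pattern `<account>.blob.core.windows.net`; the function names and the `resolver` parameter are hypothetical helpers of mine, and this says nothing about what the plugin itself does internally.

```python
import socket


def blob_host(account, endpoint_suffix="core.windows.net"):
    """Build the blob endpoint host name for a storage account (public cloud pattern assumed)."""
    return f"{account}.blob.{endpoint_suffix}"


def resolves(host, resolver=socket.getaddrinfo):
    """Return True if the host name resolves to at least one address.

    The resolver is injectable so the check can be exercised without network access.
    """
    try:
        return len(resolver(host, 443)) > 0
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when the lookup fails.
        return False
```

For example, `resolves(blob_host("mystorageaccount"))` would report whether that placeholder account name resolves from the machine where it is run; the real account name has to be substituted.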
I enabled debug logging for the Azure ES snapshot plugin to see the actual error, but the output is not very meaningful.
The ES log says:
[2019-09-29T05:07:09,971][WARN ][r.suppressed ] [afqLC7_] path: /_snapshot/esbackuprepository/data01, params: {pretty=, repository=esbackuprepository, snapshot=data01}
java.util.NoSuchElementException: An error occurred while enumerating the result, check the original exception for details.
    at com.microsoft.azure.storage.core.LazySegmentedIterator.hasNext(LazySegmentedIterator.java:113) ~[?:?]
    at org.elasticsearch.cloud.azure.storage.AzureStorageService.lambda$listBlobsByPrefix$14(AzureStorageService.java:227) ~[?:?]
    at org.elasticsearch.cloud.azure.blobstore.util.SocketAccess.lambda$doPrivilegedVoidException$0(SocketAccess.java:64) ~[?:?]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_201]
    at org.elasticsearch.cloud.azure.blobstore.util.SocketAccess.doPrivilegedVoidException(SocketAccess.java:63) ~[?:?]
    at org.elasticsearch.cloud.azure.storage.AzureStorageService.listBlobsByPrefix(AzureStorageService.java:226) ~[?:?]
    at org.elasticsearch.cloud.azure.blobstore.AzureBlobStore.listBlobsByPrefix(AzureBlobStore.java:119) ~[?:?]
    at org.elasticsearch.cloud.azure.blobstore.AzureBlobContainer.listBlobsByPrefix(AzureBlobContainer.java:120) ~[?:?]
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository.listBlobsToGetLatestIndexId(BlobStoreRepository.java:822) ~[elasticsearch-6.5.4.jar:6.5.4]
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository.latestIndexBlobId(BlobStoreRepository.java:800) ~[elasticsearch-6.5.4.jar:6.5.4]
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getRepositoryData(BlobStoreRepository.java:663) ~[elasticsearch-6.5.4.jar:6.5.4]
    at org.elasticsearch.snapshots.SnapshotsService.createSnapshot(SnapshotsService.java:235) ~[elasticsearch-6.5.4.jar:6.5.4]
    at org.elasticsearch.action.admin.cluster.snapshots.create.TransportCreateSnapshotAction.masterOperation(TransportCreateSnapshotAction.java:83) ~[elasticsearch-6.5.4.jar:6.5.4]
    at org.elasticsearch.action.admin.cluster.snapshots.create.TransportCreateSnapshotAction.masterOperation(TransportCreateSnapshotAction.java:41) ~[elasticsearch-6.5.4.jar:6.5.4]
    at org.elasticsearch.action.support.master.TransportMasterNodeAction.masterOperation(TransportMasterNodeAction.java:108) ~[elasticsearch-6.5.4.jar:6.5.4]
    at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.doRun(TransportMasterNodeAction.java:195) ~[elasticsearch-6.5.4.jar:6.5.4]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723) [elasticsearch-6.5.4.jar:6.5.4]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.5.4.jar:6.5.4]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
Caused by: com.microsoft.azure.storage.StorageException: The specified account does not exist.
    at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:87) ~[?:?]
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:209) ~[?:?]
    at com.microsoft.azure.storage.core.LazySegmentedIterator.hasNext(LazySegmentedIterator.java:109) ~[?:?]
    ... 20 more
Caused by: java.lang.NullPointerException
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:189) ~[?:?]
    at com.microsoft.azure.storage.core.LazySegmentedIterator.hasNext(LazySegmentedIterator.java:109) ~[?:?]
    ... 20 more
Even switching to a different backup container fails, with this message:
{
  "error": {
    "root_cause": [
      {
        "type": "repository_verification_exception",
        "reason": "[esbackuprepository] path [engg-az-dev2] is not accessible on master node"
      }
    ],
    "type": "repository_verification_exception",
    "reason": "[esbackuprepository] path [engg-az-dev2] is not accessible on master node",
    "caused_by": {
      "type": "i_o_exception",
      "reason": "Can not write blob master.dat",
      "caused_by": {
        "type": "storage_exception",
        "reason": "The specified account does not exist.",
        "caused_by": {
          "type": "null_pointer_exception",
          "reason": null
        }
      }
    }
  },
  "status": 500
}
Any thoughts?