[RESOLVED] Sporadically failing snapshots

Hi,

We are running Elasticsearch 6.3.x on Elastic Cloud, with snapshots configured against GCS (Europe West 1). About a week ago, some snapshots started to fail for 1-2 indices (out of ~500), with a warning in the log similar to this:

com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113) ~[?:?]
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40) ~[?:?]
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432) ~[?:?]
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352) ~[?:?]
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469) ~[?:?]
    at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.lambda$writeBlob$5(GoogleCloudStorageBlobStore.java:213) ~[?:?]
    at org.elasticsearch.repositories.gcs.SocketAccess.lambda$doPrivilegedVoidIOException$0(SocketAccess.java:54) ~[?:?]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_144]
    at org.elasticsearch.repositories.gcs.SocketAccess.doPrivilegedVoidIOException(SocketAccess.java:53) ~[?:?]
    at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.writeBlob(GoogleCloudStorageBlobStore.java:207) ~[?:?]
    at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobContainer.writeBlob(GoogleCloudStorageBlobContainer.java:72) ~[?:?]
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository$SnapshotContext.snapshotFile(BlobStoreRepository.java:1281) ~[elasticsearch-6.3.2.jar:6.3.2]
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository$SnapshotContext.snapshot(BlobStoreRepository.java:1217) ~[elasticsearch-6.3.2.jar:6.3.2]
    ... 9 more

The issue affects both 6.3.1 and 6.3.2. Does anybody else see the same issue?

Any ideas on what to do about it are very welcome.
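In case it helps anyone check which indices are affected in their own cluster: below is a minimal sketch that groups shard failures by index, assuming you have already fetched the JSON from the snapshot info API (`GET _snapshot/<repository>/<snapshot>`) and parsed it into a dict. The function name `failed_shards_by_index` and all values in `sample` (snapshot name, index name, failure reason) are made up for illustration and are not taken from our cluster.

```python
from collections import defaultdict


def failed_shards_by_index(snapshot_response):
    """Group shard failures by index for every snapshot in the response.

    Expects a dict shaped like the body of GET _snapshot/<repo>/<snapshot>,
    i.e. a top-level "snapshots" list whose entries may carry a "failures" list.
    """
    result = {}
    for snap in snapshot_response.get("snapshots", []):
        per_index = defaultdict(list)
        for failure in snap.get("failures", []):
            # Keep only the fields needed to see what failed and why.
            per_index[failure["index"]].append(
                (failure.get("shard_id"), failure.get("reason"))
            )
        result[snap["snapshot"]] = dict(per_index)
    return result


# Hypothetical sample data; every value here is invented for illustration.
sample = {
    "snapshots": [
        {
            "snapshot": "example-snapshot-1",
            "state": "PARTIAL",
            "failures": [
                {
                    "index": "logs-000123",
                    "shard_id": 0,
                    "reason": "example failure reason",
                }
            ],
        }
    ]
}

if __name__ == "__main__":
    for name, indices in failed_shards_by_index(sample).items():
        print(name, indices)
```

Running this against the responses for several consecutive snapshots should show whether the same 1-2 indices fail each time or whether the failures move around.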

OK, it looks like I am not the only one and Elastic Cloud is looking into it; see the Elastic Cloud Status incident "Increased cluster snapshot failures on GCP regions".

The issue seems to have been fixed around midnight (UTC) on August 17, 2018.

