Snapshot deletion takes a very long time

Hello,

We have snapshots for an ES 7.5.2 cluster in Google Cloud Storage (GCS). The snapshots go back almost a year, and we are now trying to delete the old ones and keep only the last 90 days of data. Deleting a single snapshot takes 8+ hours, and we have thousands to go through.

I can see the task still running; this is the hot_threads output for the node where the task is running:

       25.4% (126.7ms out of 500ms) cpu usage by thread 'elasticsearch[elasticsearch-master-2][snapshot][T#2384]'
         10/10 snapshots sharing following 59 elements
           java.base@13.0.1/sun.nio.ch.Net.poll(Native Method)
           java.base@13.0.1/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:182)
           java.base@13.0.1/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:286)
           java.base@13.0.1/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:310)
           java.base@13.0.1/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:351)
           java.base@13.0.1/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:802)
           java.base@13.0.1/java.net.Socket$SocketInputStream.read(Socket.java:937)
           java.base@13.0.1/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:450)
           java.base@13.0.1/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
           java.base@13.0.1/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1409)
           java.base@13.0.1/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1022)
           java.base@13.0.1/java.io.BufferedInputStream.fill(BufferedInputStream.java:245)
           java.base@13.0.1/java.io.BufferedInputStream.read1(BufferedInputStream.java:285)
           java.base@13.0.1/java.io.BufferedInputStream.read(BufferedInputStream.java:344)
           java.base@13.0.1/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:746)
           java.base@13.0.1/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689)
           java.base@13.0.1/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1610)
           java.base@13.0.1/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1515)
           java.base@13.0.1/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527)
           java.base@13.0.1/sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:308)
           com.google.api.client.http.javanet.NetHttpResponse.<init>(NetHttpResponse.java:36)
           com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:144)
           com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:79)
           com.google.api.client.http.HttpRequest.execute(HttpRequest.java:995)
           com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:515)
           com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:448)
           com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
           com.google.cloud.storage.spi.v1.HttpStorageRpc.list(HttpStorageRpc.java:358)
           com.google.cloud.storage.StorageImpl$8.call(StorageImpl.java:355)
           com.google.cloud.storage.StorageImpl$8.call(StorageImpl.java:352)
           com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
           com.google.cloud.RetryHelper.run(RetryHelper.java:76)
           com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
           com.google.cloud.storage.StorageImpl.listBlobs(StorageImpl.java:351)
           com.google.cloud.storage.StorageImpl.access$200(StorageImpl.java:82)
           com.google.cloud.storage.StorageImpl$BlobPageFetcher.getNextPage(StorageImpl.java:296)
           com.google.cloud.PageImpl.getNextPage(PageImpl.java:116)
           com.google.cloud.PageImpl$PageIterator.computeNext(PageImpl.java:66)
           com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
           com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
           java.base@13.0.1/java.lang.Iterable.forEach(Iterable.java:74)
           org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.lambda$listBlobsByPrefix$2(GoogleCloudStorageBlobStore.java:145)
           org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore$$Lambda$4566/0x00000008018a7440.run(Unknown Source)
           org.elasticsearch.repositories.gcs.SocketAccess.lambda$doPrivilegedVoidIOException$0(SocketAccess.java:54)
           org.elasticsearch.repositories.gcs.SocketAccess$$Lambda$4567/0x00000008018a6840.run(Unknown Source)
           java.base@13.0.1/java.security.AccessController.executePrivileged(AccessController.java:784)
           java.base@13.0.1/java.security.AccessController.doPrivileged(AccessController.java:554)
           org.elasticsearch.repositories.gcs.SocketAccess.doPrivilegedVoidIOException(SocketAccess.java:53)
           org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.listBlobsByPrefix(GoogleCloudStorageBlobStore.java:144)
           org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.listBlobs(GoogleCloudStorageBlobStore.java:129)
           org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobContainer.listBlobs(GoogleCloudStorageBlobContainer.java:47)
           app//org.elasticsearch.repositories.blobstore.BlobStoreRepository.deleteShardSnapshot(BlobStoreRepository.java:1321)
           app//org.elasticsearch.repositories.blobstore.BlobStoreRepository.access$400(BlobStoreRepository.java:159)
           app//org.elasticsearch.repositories.blobstore.BlobStoreRepository$1.doRun(BlobStoreRepository.java:682)
           app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:773)
           app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
           java.base@13.0.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
           java.base@13.0.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
           java.base@13.0.1/java.lang.Thread.run(Thread.java:830)

This might answer my question about why it's running for such a long time. We do have almost a year's worth of snapshots, taken 6 times a day.

In order to create a new snapshot or delete an existing snapshot, Elasticsearch has to load all existing shard-level snapshots to figure out which files need to be copied and which files can be cleaned up. The number of files to be checked is equal to number_of_shards * number_of_snapshots, which on large clusters with frequent snapshots can lead to very long operation times, especially with non-filesystem repositories.
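As a rough illustration of that cost model (the shard count below is hypothetical, not from the original post):

```python
# Rough illustration of the cost model described above: each snapshot
# create/delete has to examine roughly
# number_of_shards * number_of_snapshots shard-level snapshot files.

def blobs_to_check(number_of_shards, number_of_snapshots):
    """Approximate number of shard-level snapshot files examined per operation."""
    return number_of_shards * number_of_snapshots

# e.g. a year of 6 daily snapshots of a hypothetical 50-shard cluster:
snapshots = 365 * 6   # ~2190 snapshots
shards = 50           # hypothetical shard count
print(blobs_to_check(shards, snapshots))  # 109500 files checked per delete
```

With thousands of files to list per delete over an HTTPS object store, the per-request latency visible in the stack trace above adds up quickly.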

Here's the code that deletes a snapshot:

Hi @proton

Your best option to speed up those deletes would be to upgrade to ES 7.8 or newer. I implemented deleting multiple snapshots at once in

and in addition to that, deletes were heavily optimised in 7.8+ and should only take a fraction of the time they take in earlier versions.
For 7.5 you could try increasing the number of threads in your snapshot thread pool by raising the setting

thread_pool:
    snapshot:
        max: 50

This should help as well, but how much depends on the exact data in your repository. 50 is just an example here; you could try going even higher if your nodes permit it resource-wise.
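For reference, once on 7.8+, the multi-snapshot delete mentioned above takes a comma-separated list of snapshot names in a single call (DELETE _snapshot/&lt;repo&gt;/&lt;name1&gt;,&lt;name2&gt;,...). A minimal sketch of batching names for that API; the repository and snapshot names here are made up:

```python
# Sketch: group old snapshot names into comma-separated batches for the
# 7.8+ multi-snapshot delete API. Repository/snapshot names are hypothetical.

def delete_urls(repo, snapshots, batch_size=10):
    """Build one DELETE path per batch of snapshot names."""
    return [
        "/_snapshot/%s/" % repo + ",".join(snapshots[i:i + batch_size])
        for i in range(0, len(snapshots), batch_size)
    ]

urls = delete_urls("gcs_repo", ["daily-%d" % n for n in range(25)], batch_size=10)
print(len(urls))  # 3 batched DELETE requests instead of 25 single ones
print(urls[0])
```

Each batch amortizes the repository-wide bookkeeping over several snapshots, which is where most of the savings come from.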


THANK YOU for improving this and allowing multiple deletes - I just spent weeks purging old snapshots on a not-very-large cluster with bash scripts running overnight; I was not aware of the snapshot thread pool.

Though I'm surprised it has to cover all the snapshots, and not just the ones before the one being deleted. I thought it essentially figured out which old segments would be deleted and which needed to be relinked to the next-oldest snapshot, without involving later snapshots or the current state - but maybe I misunderstand how deletes work in the incremental system (it's not really documented).

@proton
If you can upgrade the entire cluster, well and good - that is obviously the best option.

If you can't upgrade right away, you can create a new bucket and define a new repository; ES allows multiple registered repositories. Start taking snapshots in the new repository. Your first snapshot will take a long time depending on the size of the data, but afterwards it should be back to normal. After 90 days, unregister the old repository and delete its GCS bucket.
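That workflow amounts to registering a second repository pointing at a new bucket (PUT _snapshot/&lt;repo_name&gt; with the GCS repository plugin). A minimal sketch of the request body; the bucket and base_path names are placeholders:

```python
import json

# Sketch: request body for registering a second GCS repository via
# PUT /_snapshot/<repo_name>. Bucket and base_path are placeholders.
def gcs_repo_body(bucket, base_path=""):
    settings = {"bucket": bucket}
    if base_path:
        settings["base_path"] = base_path
    return json.dumps({"type": "gcs", "settings": settings})

body = gcs_repo_body("my-new-snapshot-bucket", base_path="es-snapshots")
print(body)
```

Because the old repository stays registered (read-only in practice) until you unregister it, restores from the old snapshots remain possible during the 90-day window.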

@Armin_Braun
Thanks for the fix. Is it possible (with no downsides) to create a single-node 7.8 ES cluster, register the same bucket as a repository, and delete snapshots from there? Basically, is there a downside to using the same repository from two clusters?


You can do that, but (and it's a serious but) you have to make absolutely sure not to write to the repository concurrently from both the 7.8 node and the cluster. Ideally, I would remove the repository from the cluster, then add it to the single 7.8 node and do the deletes there. Once the deletes have gone through, power down the 7.8 node and add the repo back to the old-version cluster.
That seems to be a safe set of steps.
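The handover above can be sketched as an ordered sequence of repository API calls. All names are placeholders, and each step must fully complete before the next one starts:

```python
# Sketch of the safe handover described above, as the sequence of
# repository API calls involved. Repository name is a placeholder.
repo = "gcs_repo"

steps = [
    # 1. Remove the repository from the 7.5 cluster (cluster metadata only;
    #    the GCS bucket contents are untouched).
    ("7.5 cluster", "DELETE", "/_snapshot/%s" % repo),
    # 2. Register the same bucket on the single 7.8 node.
    ("7.8 node", "PUT", "/_snapshot/%s" % repo),
    # 3. Run the snapshot deletes from the 7.8 node.
    ("7.8 node", "DELETE", "/_snapshot/%s/<old-snapshots>" % repo),
    # 4. Power down the 7.8 node, then re-register on the old cluster.
    ("7.5 cluster", "PUT", "/_snapshot/%s" % repo),
]

for cluster, method, path in steps:
    print("%s: %s %s" % (cluster, method, path))
```

The ordering is the whole point: at no moment do both clusters have the repository registered, so concurrent writes to it are impossible.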


@Armin_Braun
Makes sense. Thanks

Armin, is there a good reference on how this works, especially how deletes are done, the disk space and performance implications, whether it's better to delete older or newer snapshots first, etc.? I'm always interested in the nuts and bolts.

@Steve_Mushero

We don't have a blog post or similar describing the snapshot mechanism at a high level. But we do have detailed technical documentation on both the snapshot process and the exact mechanics of how data is stored in and deleted from the repository, in the package-level Javadocs of ES.

See here for the snapshot state machine documentation and here for the mechanics of the repository.


Awesome - I'll dig into it. I also love digesting detailed docs into high-level blog posts; maybe I'll try that when time permits.

@Armin_Braun - I wrote a blog post on this; hope it's mostly right (happy to hear any feedback, of course):