Our Elasticsearch indices are about 12TB and we take backups on a daily basis with a retention of 3 days. How can the snapshot storage size in S3 be 60TB?
Is there any way to estimate the size if we retain snapshots for 14 days?
What does
GET /_cat/snapshots
show you? And is it what you expect to see?
Also look at
GET _snapshot/your_repository/your_snapshot/_status?pretty
for your repo/snapshots.
When you say your indices are about 12TB, how much of that is changing per day? e.g. do you get 1TB of new data per day, and delete 1TB of old data, to stay around the 12TB mark?
GET /_cat/snapshots gives me the expected results, i.e. the 3 snapshots we have and their details.
GET _snapshot/your_repository/your_snapshot/_status?pretty -> this gives me the following stats:
...
"stats": {
"incremental": {
"file_count": 999,
"size_in_bytes": 24356736580
},
"total": {
"file_count": 31563,
"size_in_bytes": 6821125396380
},
"start_time_in_millis": 1738219207015,
"time_in_millis": 367952
}
...
Basically, 12TB is the disk space used by ES, and here the snapshot status says the overall size for the snapshot is ~6TB. But the S3 bucket where we store our snapshots is 55TB.
We have indices created per day which are not automatically deleted; we delete them manually. However, once we delete them we expect the S3 size to reduce after 3 days, as that is our retention. Even if we do not delete the indices, the S3 bucket size should be comparable to the indices' size, as we have daily indices and those are not modified from the next day. Correct me if I am wrong.
Can you provide a little more context about this? Are you using searchable snapshots on the frozen tier or normal snapshots? Is the retention of 3 days for the indices or for the snapshots? It is not clear.
Are you deleting the snapshots as well? Deleting indices in Elasticsearch will not change the size of the snapshots.
If you have 1 TB in your cluster and create a snapshot, you will have 1 TB in your snapshot. If tomorrow you delete this 1 TB of data and add new data resulting in another 1 TB and make a snapshot, you will then have 2 TB of snapshots, and this will go on until you remove the snapshots that you no longer need.
We are using normal snapshots. Retention for snapshots is 3 days. Retention for indices is 30 days. So 30 days of indices sum up to 12TB on our disk.
Yes, snapshots are deleted automatically using a snapshot retention policy of 3 days.
So if my snapshots are only stored for 3 days, how did the S3 bucket pile up?
How frequently do you take snapshots?
It sounds like you are using time-based indices - is your data immutable?
We take daily snapshots, and indices are also daily. Old indices' data is not modified.
Are you using indices with the date in the index name?
Are you updating the index while it is active? At what age do indices not get updated any longer?
What is your average index and shard size?
Yes, we use the date in the index name. So only today's index will be updated, and indices older than today will not be updated.
Average index size is ~700GB. We have 5 shards per index and 5 replicas. Shard size is around 120GB.
Are you creating your snapshots manually in a way that each snapshot will only have the data for the specific day or are you using some index pattern?
Because if you have 30 days of retention for your indices and you create a snapshot using SLM, it will create a snapshot with the data for all 30 days. It will only upload data that was not present in other snapshots, but the segments for the other indices will still be referenced in that snapshot.
So when you delete a snapshot it will not delete everything because some of the segments are still being referenced by other snapshots.
The explanation on this older similar question seems to fit your case.
As an extreme example, if you have an index that is never written to, and you snapshot it every day, then each of those snapshots will refer to the same segment file.
If you delete a single snapshot, then it will recover a very small amount of disk space (cluster state, and metadata about the snapshot), but you would have to remove every single snapshot before the segment file was deleted.
We are creating snapshots of all indices using snapshot policies, not manually.
I get your point. But if I delete my indices today (today's snapshot is still referencing those indices), then after 3 days that snapshot will be deleted, right? Which means no snapshot is referencing those indices. But I see no space being cleared; instead, the size of the bucket is growing constantly.
I have also experimented with a test ES cluster which has the same indices configuration but a snapshot retention of 1 day, so that I can be sure it is not pointing to deleted indices. Even in that case I see an S3 size more than 4 times the actual disk size: disk usage around 1.7TB, S3 bucket around 7.8TB.
Stats for this setup:
"stats": {
"incremental": {
"file_count": 4026,
"size_in_bytes": 39458248728
},
"total": {
"file_count": 35533,
"size_in_bytes": 1009514608931
},
"start_time_in_millis": 1738385099891,
"time_in_millis": 178458
}
If you keep your data for 30 days in the cluster, that is around 21TB of data (30 days at ~700GB per daily index), not 12TB.
Snapshots in Elasticsearch work at the segment level, and it is important to note that segments are immutable. When you update or delete data, the new data is written to a new segment and the old documents in existing segments are marked as deleted. The space is however not reclaimed until a segment merge happens, so the index can grow in size as you update and/or delete.
If you are writing and updating into today's index and then at some point take a snapshot of it (let us assume you time this to when no more indexing or updates will go to this index), you will store all the segments currently in existence. If some of these existed when the previous snapshot was taken, those will be reused.
At this point it is however possible that merging will kick in (you can even do this as part of the ILM process), which will merge data into new segments and delete the existing ones that are no longer required. Even if no more data is changed in this index, which is now read-only, it is possible that a significant portion of the segments have changed as a result of the merging, and these will be snapshotted the next time a snapshot is taken. This means that this index may take up almost two times its size on disk in the worst case (maybe more if you did not time the snapshot perfectly to the index becoming read-only).
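For example, once you know a daily index will receive no further writes, you could force merge it once (manually, or via ILM's forcemerge action) before the next snapshot runs, so that its segments stop changing and later snapshots can reuse them. A minimal sketch, using a placeholder index name:
POST /your_read_only_index/_forcemerge?max_num_segments=1
After that, the segments of this index stay stable, so only the first snapshot taken after the merge pays the cost of uploading them again.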
Thanks for the explanation.
Regarding the average index size, I just randomly chose one index for a value, but it actually varies quite significantly. We have indices ranging from 80GB up to a few at 700GB. Here is a screenshot of the disk usage, which now sums up to a value of less than 12TB as a few indices have been deleted.
And one more point: we have daily data streams (not indices); I don't think that will make much of a difference?
"maybe more if you did not time the snapshot perfectly to the index becoming read only" -> Do you mean we should create the snapshot at the time when the new day's index is created?
That's not how averages work.
I presume you are getting the volume of bytes used in the S3 bucket from the S3 tooling itself, right? Nothing else is using that S3 bucket, right?
Are you using ILM to manage the indices in the data stream? I'm guessing not as you wrote "We have indices created per day and not automatically deleted. We delete them manually".
Can you share the complete output from the
GET _snapshot/your_repository/your_snapshot/_status
for all your snapshots please.
Yes, we get the S3 volume bytes from the S3 monitoring itself, and nothing else is stored there.
And yes, we do not use ILM.
Below are the results of the snapshot status API.
feb-1
jan-31
jan-30
What is the ideal timing for the snapshot policy, and also for the snapshot retention policy?
Thanks for sharing the data.
The first thing I notice is that you are not taking snapshots of a specific index pattern, nor of a data stream; you are snapshotting everything.
The second thing I notice is that the manual index deletion process has not been daily, e.g. several indices from late December were seemingly deleted on the same day.
The third thing I notice is that .ds-nfr-ontrack-2025.01.06-2025.01.06-000001 is missing from the first snapshot (feb-1) but is present in the other 2, while .ds-nfr-ontrack-2025.01.05-2025.01.05-000001 and .ds-nfr-ontrack-2025.01.07-2025.01.07-000001 are in all 3 snapshots. So maybe someone deleted .ds-nfr-ontrack-2025.01.06-2025.01.06-000001 a bit early?
I am guessing there are a few errors creeping in.
The total reported size of all 3 snapshots is way less than 60TB. Well, it's the sum of:
% jq '.snapshots.stats.total.size_in_bytes' snap1.json snap2.json snap3.json
4274445868309
6877399517217
6798431045213
which is
% echo $(( 4274445868309 + 6877399517217 + 6798431045213 ))
17950276430739
i.e. around 18TB.
Please also note @Christian_Dahlqvist 's points above.
Do you have CloudTrail logging of S3 events? When your snapshot deletion runs you should see a bunch of S3 objects being deleted. E.g. between your snapshots on jan-30 and feb-1, the following indices are no longer referenced in the newer snapshot:
.ds-nfr-ontrack-2024.12.25-2024.12.25-000001
.ds-nfr-ontrack-2024.12.26-2024.12.26-000001
.ds-nfr-ontrack-2024.12.27-2024.12.27-000001
.ds-nfr-ontrack-2024.12.28-2024.12.28-000001
.ds-nfr-ontrack-2024.12.29-2024.12.29-000001
.ds-nfr-ontrack-2024.12.30-2024.12.30-000001
.ds-nfr-ontrack-2024.12.31-2024.12.31-000001
.ds-nfr-ontrack-2025.01.06-2025.01.06-000001
But they are still in the older snapshot. When you delete that snapshot, the one from jan-30, those indices will no longer be referenced by any remaining snapshot.
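If it helps, a quick way to see which indices dropped out between two of the _status outputs (the file names here are just placeholders for wherever you saved them):
% jq -r '.snapshots[].indices | keys[]' snap_jan30.json | sort > jan30.txt
% jq -r '.snapshots[].indices | keys[]' snap_feb1.json | sort > feb1.txt
% comm -23 jan30.txt feb1.txt
comm -23 prints the index names that appear only in the older snapshot, which should match the list above.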
@RainTown @Christian_Dahlqvist and @leandrojmp Thank you !
It was a miss on our side. We have versioning enabled on the bucket without a lifecycle policy for the non-current objects. As the bucket size also includes the size of non-current object versions and objects with delete markers, it is showing such a high value in the S3 CloudWatch metrics. I have just added a lifecycle rule to delete the non-current objects.
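For anyone else hitting this, the lifecycle configuration needed is roughly of this shape (a sketch only; the bucket name and day count below are placeholders, not our exact values):
% cat lifecycle.json
{
  "Rules": [
    {
      "ID": "expire-noncurrent-snapshot-objects",
      "Status": "Enabled",
      "Filter": {},
      "NoncurrentVersionExpiration": { "NoncurrentDays": 1 },
      "Expiration": { "ExpiredObjectDeleteMarker": true }
    }
  ]
}
% aws s3api put-bucket-lifecycle-configuration --bucket your-snapshot-bucket --lifecycle-configuration file://lifecycle.json
This expires non-current object versions and cleans up the delete markers left behind, so the bucket size metric drops back towards the size of the current snapshot data.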
Can you help us decide the timing for snapshots and retention? Should the snapshot creation time be exactly at the point when the new index is created, and the retention time roughly when the snapshot creation is completed?
That'd explain it for sure
The snapshot creation frequency depends on your RPO, but it's pretty common to take a snapshot every 30min. Remember that snapshots are deduplicated so any data that doesn't change between snapshots takes up no extra space.
The retention time is also entirely up to you, but it sounds like you're currently retaining data for 3 days which is a little shorter than most folks use but fine if you never want to look at any older data.
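If you want both the schedule and the retention in one place, an SLM policy is the usual way to express this. A minimal sketch, where the policy name, repository, schedule, and retention values are illustrative placeholders to adapt to your own RPO:
PUT _slm/policy/daily-snapshots
{
  "schedule": "0 30 1 * * ?",
  "name": "<daily-snap-{now/d}>",
  "repository": "your_repository",
  "config": {
    "indices": ["*"],
    "include_global_state": false
  },
  "retention": {
    "expire_after": "14d",
    "min_count": 3,
    "max_count": 30
  }
}
Note that the retention block only deletes snapshots; it never touches the live indices, and the schedule is what determines your RPO.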
Thanks @DavidTurner! This is basically our dev setup; our prod data is quite huge and sums up to around 150TB!! For that we will set retention to 14 days. But because the amount of data that comes in is quite large, we feel 30 mins will not be sufficient to complete the backup and we may have 2 concurrent backups running. Our RPO is 1 day.
Two observations:
- If ES cannot complete a half-hourly snapshot within 30min then there's a fair chance it won't complete an hourly snapshot within 60min, because the hourly snapshots will have twice as much new data to handle.
- Concurrent snapshots are OK; there's no need to settle for a worse RPO just to avoid them.
Ultimately it is up to you, but we have users taking 30-min snapshots on systems with way more than 150TiB of data. But if your business only needs an RPO of 1 day then it would also be fine to take a snapshot every 6/8/12 hours.