We are using Elasticsearch for the first time and are preparing to back up a cluster that is approximately 600 GB in size. Before we proceed with configuring the backup, we need to do some calculations so we can tell the IT team how much disk space is required. Can anyone help me understand how the backup size relates to the Elasticsearch cluster size? Is the backup typically the same size as, or nearly equal to, the cluster?
Additionally, I have a second question regarding snapshots. Is each snapshot a full backup, or is each subsequent snapshot incremental, containing only the data that changed? If snapshots are incremental, will we need all previous snapshots in order to restore?
I would appreciate any clarification on this matter.
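For context, a minimal sketch of how such a backup is typically set up with a shared filesystem repository is shown below; the repository name my_fs_repo and the path /mnt/es_backups are placeholders, and the path has to be listed under path.repo in elasticsearch.yml:

# Register a shared filesystem snapshot repository (location must be allowed by path.repo)
PUT _snapshot/my_fs_repo
{
  "type": "fs",
  "settings": {
    "location": "/mnt/es_backups"
  }
}

# Take a snapshot of all indices and wait for it to complete
PUT _snapshot/my_fs_repo/snapshot_1?wait_for_completion=true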
Thanks for your reply. To confirm my understanding: say I have taken a backup each day for three days, and my ELK cluster crashes on the fourth day. Should I restore only the latest snapshot to bring the cluster back to its most recent state, or do I need to restore all three days of backups?
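For illustration only, a restore request would look roughly like the sketch below; the repository and snapshot names are placeholders, and any indices being restored must not already be open in the cluster:

# List the snapshots available in the repository
GET _snapshot/my_fs_repo/_all

# Restore the chosen snapshot (existing indices must be closed or deleted first)
POST _snapshot/my_fs_repo/snapshot_3/_restore
{
  "indices": "*"
}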
Thanks @stephenb and @dadoonet. I had a test index (2 MB in size) with 1 shard and no replicas, but while taking a snapshot I could see that it created a folder named indices inside the repository, and then created almost 22 files, listed below. Could you please help me understand what these files are, and what happens if any of the files in this location get corrupted?
-rw-r--r-- 1 elasticsearch elasticsearch 6624 Jun 25 05:08 __6F2F1OZhQa6Q-PROW98b1A
-rw-r--r-- 1 elasticsearch elasticsearch 595 Jun 25 05:04 __a70BmZ_ETU2l0Yblz1ncgw
-rw-r--r-- 1 elasticsearch elasticsearch 595 Jun 25 05:04 __b9oxXFZdQV6BdGeq1I33Pg
-rw-r--r-- 1 elasticsearch elasticsearch 595 Jun 25 05:06 __cTHSljJxReWvZ4UOy7BXoQ
-rw-r--r-- 1 elasticsearch elasticsearch 595 Jun 25 05:04 __dV38DfhjSZmwpMKJ9d1_cQ
-rw-r--r-- 1 elasticsearch elasticsearch 546763 Jun 25 05:04 __gltEL0OLTs-E9sPq5miz7w
-rw-r--r-- 1 elasticsearch elasticsearch 6656 Jun 25 05:04 __H-HoD-KHSqWZTxhRblkDNg
-rw-r--r-- 1 elasticsearch elasticsearch 595 Jun 25 05:04 __HuF3eoo1RWSZQmKyollW9Q
-rw-r--r-- 1 elasticsearch elasticsearch 2438 Jun 25 05:11 index--0hDTB7ISauA9WkfycOSXA
-rw-r--r-- 1 elasticsearch elasticsearch 6624 Jun 25 05:04 __jRKaTRqwT4O-Dl4AHsXR0w
-rw-r--r-- 1 elasticsearch elasticsearch 595 Jun 25 05:04 __k4lFEUBWRvGbGZ1h-0xe3A
-rw-r--r-- 1 elasticsearch elasticsearch 256190 Jun 25 05:04 __LWvI_EifSvidSC3S-NU1qQ
-rw-r--r-- 1 elasticsearch elasticsearch 595 Jun 25 05:04 __NRj50aZqTlaxhYprYNs6KQ
-rw-r--r-- 1 elasticsearch elasticsearch 6624 Jun 25 05:06 __Qb-FZzqxTLu9eayANi5N-w
-rw-r--r-- 1 elasticsearch elasticsearch 2375 Jun 25 05:08 snap-rrCW9aFwRrKPESnkHTmFuQ.dat
-rw-r--r-- 1 elasticsearch elasticsearch 595 Jun 25 05:08 __Son4y1pdTJ6A-9rRdffIBQ
-rw-r--r-- 1 elasticsearch elasticsearch 595 Jun 25 05:04 __VCrwWGHlQ86xFAYS5kVS9w
-rw-r--r-- 1 elasticsearch elasticsearch 6656 Jun 25 05:04 __VVXzyb_6T9uVFziHqI3-Uw
-rw-r--r-- 1 elasticsearch elasticsearch 6624 Jun 25 05:04 __Wq2LWWuPS26bX5o1QkoSkA
-rw-r--r-- 1 elasticsearch elasticsearch 595 Jun 25 05:04 __ZrHtTf_-Q3CMv0rouXyXXg
-rw-r--r-- 1 elasticsearch elasticsearch 638953 Jun 25 05:04 __ZuWTezzOTBicUddyRSZbIA
Er, maybe the result of taking the snapshot? BTW, I count 21 files, which is “almost 22” indeed.
Then it’s possible, even likely, you cannot restore your index from the snapshot. Why are you asking? If it’s a test, corrupt the files yourself and see what happens.
There’s something a little off in this thread. You started with “We are using Elasticsearch for the first time”, but you yourself have created dozens of threads on this forum over a significant period of time. Please remember the people reading and answering here are volunteers.
Each shard is constructed of one or more segments, so those files most likely just represent segments.
Segments are the actual files on the file system that contain the Elasticsearch index data.
They get created during indexing, and over time they may be merged down into fewer segments, or even a single segment, depending on the settings.
A snapshot will contain whatever segments exist at the time the snapshot is taken.
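As an illustration, the segments behind an index can be inspected with the cat segments API; the index name testindex below is just a placeholder:

GET _cat/segments/testindex?v&h=index,shard,segment,docs.count,docs.deleted,size,committed,searchable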
If that happens then your snapshot data will probably not be restorable. You need to protect the integrity of these files very carefully. See these docs for more information:
If something other than Elasticsearch modifies the contents of the repository then future snapshot or restore operations may fail, reporting corruption or other data inconsistencies, or may appear to succeed having silently lost some of your data.
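If you are ever unsure about the state of a repository, the repository verification API at least confirms that all nodes can access and write to it (the repository name below is a placeholder, and this does not detect every form of corruption):

POST _snapshot/my_fs_repo/_verify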
Today we encountered a scenario where we need assistance in configuring our backup plan. To get a broader perspective at the cluster level, I ran a test on a single index. I initially created an index of about 2 GB and took my first snapshot, which was also approximately 2 GB. Afterward, I added 1 GB of data, and after running another snapshot I observed that the backup size had increased to 3 GB, which is expected.
Subsequently, I deleted nearly 2 GB of data (leaving about 1 GB in the cluster). However, when I took another snapshot into the same old repository, the size at the operating-system level had increased to nearly 4 GB. This means the backup size exceeded the cluster size, even after deleting the older snapshots from the Kibana console.
When I created a new repository and took a snapshot there, the size was 1 GB, as expected. How can we address this bloating of the backup size at the OS level for the old repository? Is it unavoidable, or should we create a new repository and take fresh snapshots whenever the cluster data changes?
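One thing that may be relevant here: after snapshots are deleted, any data in the repository that is no longer referenced by a snapshot can be removed with the repository cleanup API. The names below are placeholders, and this is only a sketch of the idea:

# Delete a snapshot that is no longer needed
DELETE _snapshot/my_fs_repo/snapshot_1

# Remove repository data that is no longer referenced by any existing snapshot
POST _snapshot/my_fs_repo/_cleanup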
How did you determine that the cluster was 1 GB in size at that point in time?
If it was via the (understandable) logic of “I created 3 GB of data but deleted 2 GB of it, so therefore I now have only 1 GB of data in my cluster”, then that logic is slightly flawed. In short, data is not deleted from disk immediately when you delete a document from an index; rather, the document is marked as deleted at that point in time and gets tidied up later. For example, the /_cat/indices output shows some stats on deleted documents, as in the sketch below.
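For example, a request along these lines shows the deleted-document counts (the index name is a placeholder):

GET _cat/indices/my-index?v&h=index,docs.count,docs.deleted,store.size,pri.store.size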
There’s some devil in the details here: is your data all in one index or spread over multiple indices, are you deleting documents or whole indices, how exactly are you creating your snapshots, and so on.
I don’t know whether your goal here is still just to estimate sizes for your real data snapshots, or whether you are now trying to build up your knowledge of the implementation specifics, or something else. All are fine.
@dadoonet I deleted the data from the index using the REST API requests below.
POST /employee/_delete_by_query
GET _cat/indices/employee?v&h=index,store.size
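For context, a _delete_by_query request needs a query in its body. The original body was not shown, so the range filter on a @timestamp field below is purely hypothetical:

POST /employee/_delete_by_query
{
  "query": {
    "range": {
      "@timestamp": { "lt": "now-30d" }
    }
  }
}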
My data lives in one index with one shard. My objective is to develop an efficient backup strategy that handles purging indices from the Elasticsearch cluster after the retention period, while ensuring the snapshot size at the OS level does not keep bloating.
So you created more segments that mark as deleted the data you eventually want removed. Elasticsearch had to save those new files.
It's also possible that some segment merging happened, which means new files representing the whole dataset in another way.
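If the goal is purely to reclaim space held by deleted documents before snapshotting, one option is a force merge that expunges deletes, shown here against the employee index from the earlier example; note that this rewrites segments and can be an expensive operation:

# Rewrite segments to drop documents marked as deleted (can be expensive)
POST /employee/_forcemerge?only_expunge_deletes=true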
Thanks @dadoonet. So in this case, how can we create a backup strategy so that the snapshots at the OS level do not keep holding space for data that has already been removed at the cluster level?
You have not told us a lot about your use case. For logs and metrics use cases, data retention is managed by deleting complete indices, so if you have data of this type, your test would be neither realistic nor representative.
If your use case inserts, updates, and deletes documents in a fixed set of indices that are not aged out, it is harder to estimate snapshot size requirements, as all of these operations result in additional segments being created before the old ones are eventually removed and stop being included in snapshots.
Sorry @Christian_Dahlqvist, I apologize for the confusion. At our customer site, we insert data into a daily index managed by an ILM policy. Our goal is to retain 30 days of indices in the cluster, dropping the day-one index on the 31st day. Could you please help me configure our backup strategy? In this scenario, will the snapshots avoid bloating at the OS level?
Unfortunately, we cannot test this scenario in our lower environment due to resource constraints. Therefore, I plan to test it on an individual index by adding and deleting data from it; the same backup strategy would then apply to the cluster as a whole.
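For a daily-index, 30-day retention setup like this, snapshot lifecycle management (SLM) is the usual way to schedule snapshots and expire old ones automatically. Everything in the sketch below (policy name, schedule, repository name, and retention values) is a placeholder, not a recommendation for this particular cluster:

# Hypothetical SLM policy: daily snapshot at 01:30, snapshots expire after 30 days
PUT _slm/policy/daily-snapshots
{
  "schedule": "0 30 1 * * ?",
  "name": "<daily-snap-{now/d}>",
  "repository": "my_fs_repo",
  "config": {
    "indices": ["*"]
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}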