How much disk storage is required to back up a 300GB index?

Jax_dev · March 16, 2021, 9:48pm

We want to take a backup of one of our indices, which is 300GB in size, with compress: true enabled.

How much space will the snapshot occupy?

How do we take a snapshot to Azure Blob Storage? The Elasticsearch cluster runs on Azure Ubuntu VMs.

Christian_Dahlqvist · March 17, 2021, 7:46am

I would guess it will take up about the same amount of space as the primary shards, given that most segment data is already compressed and will not compress much further.
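As a rough way to put a number on that estimate, the primary shard sizes can be summed from the output of GET _cat/shards?h=index,shard,prirep,store&bytes=b. The following is a minimal sketch, not an official tool: the function name and the sample rows are made up for illustration, and the real figures would come from your own cluster.

```python
def primary_store_bytes(cat_shards_text: str, index: str) -> int:
    """Sum the store size of the primary shards of one index.

    Expects the plain-text response of
    GET _cat/shards?h=index,shard,prirep,store&bytes=b
    (no header row, sizes reported in bytes).
    """
    total = 0
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Unassigned shards have an empty store column, so skip short rows.
        if len(fields) < 4:
            continue
        name, _shard, prirep, store = fields[:4]
        # Count only primaries ("p"); replicas ("r") are ignored here.
        if name == index and prirep == "p":
            total += int(store)
    return total


# Hypothetical rows for illustration only, not the output from this thread.
sample = """\
productioncustomerdata 0 p 324270030848
productioncustomerdata 0 r 324270030848
otherindex 0 p 1048576
"""

estimate = primary_store_bytes(sample, "productioncustomerdata")
print(f"Estimated snapshot size: {estimate / 1024**3:.0f} GiB")
```

The replica row is deliberately left out of the sum, since the estimate above is based on the primary shards only.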

DavidTurner · March 17, 2021, 8:52am

That's right: the docs do say that compress: true applies only to metadata like settings and mappings and has no effect on the actual data.
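For the Azure Blob Storage part of the question, here is a minimal sketch of the two REST requests involved: registering a repository of type azure with compress set, then creating a snapshot of a single index. It only builds the method, path and JSON body; it does not send anything. The repository, container and snapshot names are placeholders, and it assumes the Azure repository support is installed (a plugin on 2021-era versions) with the storage account credentials already in the Elasticsearch keystore.

```python
import json


def azure_repository_request(repo: str, container: str, base_path: str = ""):
    """Build the request that registers an Azure snapshot repository."""
    # compress only affects metadata files such as settings and mappings.
    settings = {"container": container, "compress": True}
    if base_path:
        settings["base_path"] = base_path
    return "PUT", f"/_snapshot/{repo}", {"type": "azure", "settings": settings}


def snapshot_request(repo: str, snapshot: str, index: str):
    """Build the request that snapshots one index into the repository."""
    body = {"indices": index, "include_global_state": False}
    return "PUT", f"/_snapshot/{repo}/{snapshot}", body


# Placeholder names: replace with your own repository, container and snapshot.
requests_to_send = [
    azure_repository_request("azure_backup", "es-snapshots"),
    snapshot_request("azure_backup", "snapshot_1", "productioncustomerdata"),
]

for method, path, body in requests_to_send:
    print(method, path)
    print(json.dumps(body, indent=2))
```

Both requests go to the cluster's REST API on any node, for example with curl -XPUT and a Content-Type: application/json header.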

Jax_dev · March 17, 2021, 10:34am

When I ran curl -XGET "http://localhost:4200/_cat/shards?v" for one of the indices, it showed around 302GB. But when I look at the data path, the disk used is only around 40GB. Is the actual data compressed when it is stored on disk?

DavidTurner · March 17, 2021, 11:23am

Can you share the actual outputs? GET _cat/shards returns a lot of numbers, so it's hard to know exactly which ones you're asking about.

Jax_dev · March 17, 2021, 11:39am

If you look under store, it shows around 300GB, but on the actual mounted disk it occupies only 30GB to 40GB. Is this possible?

DavidTurner · March 17, 2021, 11:51am

Hmm that's pretty weird. How are you determining that the actual disk usage is only 30GB?

Jax_dev · March 17, 2021, 12:29pm

Sorry for the confusion. I have rechecked path.data on all 4 Elasticsearch servers. On one of the servers my colleague had set it to var/lib rather than the data/elastic mount, so I was checking the wrong mount. That part is now clear.

But the index (the one I shared in the screenshot) is present on only 2 servers. Is there any specific reason for this? The primary is on one server and the replica is on another. Why is the same index not on the other 2 servers?

I can see in the configuration that all the nodes have the data role as well. Can you clear up this confusion for me: what exactly is happening? Let me know if you need any more info to debug this.

I want to take a snapshot of the index productioncustomerdata. Is it OK if I take it from the server where the primary is located? Do I have to take a snapshot of the replica as well? And if I delete the index after taking the snapshot of the primary, will that delete the replica as well?

DavidTurner · March 17, 2021, 12:52pm

There are only two copies of this shard, one primary and one replica, so it can only be on two nodes.

Jax_dev · March 17, 2021, 12:56pm

I have 4 Elasticsearch servers (data nodes), right? So it should not be available on all 4 data nodes?

DavidTurner · March 17, 2021, 1:21pm

Yes that's correct. You can add more replicas if you want.
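To make the replica arithmetic concrete: each shard has one primary plus number_of_replicas copies, and copies of the same shard are placed on different nodes. The sketch below (helper names are illustrative, and it assumes a single-shard index as in this thread) works out how many replicas would put a copy on every data node, what that would cost in disk, and the body of the index settings update that changes the replica count.

```python
def replicas_for_copy_on_every_node(data_nodes: int) -> int:
    """Replicas needed so that each data node holds one copy of a shard."""
    if data_nodes < 1:
        raise ValueError("need at least one data node")
    # One copy is the primary; every other node needs a replica.
    return data_nodes - 1


def total_store_gb(primary_gb: float, replicas: int) -> float:
    """Approximate cluster-wide disk use: the primary plus each replica."""
    return primary_gb * (1 + replicas)


def replica_settings_request(index: str, replicas: int):
    """Build the index settings update that changes the replica count."""
    return "PUT", f"/{index}/_settings", {"index": {"number_of_replicas": replicas}}


replicas = replicas_for_copy_on_every_node(4)
print(replicas, "replicas, about", total_store_gb(300, replicas), "GB in total")
print(replica_settings_request("productioncustomerdata", replicas))
```

With 4 data nodes this gives 3 replicas, so a 300GB primary would take roughly 1200GB across the cluster, compared with about 600GB for the current one-replica setup.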

Jax_dev · March 17, 2021, 1:33pm

When a query request comes from the application, which server will it select? Is it the primary or the replica?

warkolm · March 22, 2021, 1:55am

It could be either one.

system · April 19, 2021, 1:56am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
