Backup (snapshot) strategy for a two data node cluster


I hope you and your loved ones are safe and healthy.

I have a three-node cluster with 2 data nodes and 1 voting-only node running Elastic Stack 7.13 on Ubuntu 20.04.2. All of the nodes are VMs on a single physical workstation running ESXi.

All three nodes have separate underlying storage media for their OS and Elasticsearch data, so there is physical-layer redundancy for the data (the OS volumes share the same SSD).

Node 1 has two disks in the VM:
1.A SSD-1 holds the OS (100 GB volume)
1.B SSD-2 holds the Elastic data (1.5 TB volume)

Node 2 has two disks in the VM:
2.A SSD-1 holds the OS (100 GB volume)
2.B SSD-3 holds the Elastic data (1.5 TB volume)

Node 3 has a single disk holding the OS only.

On this ESXi host there is a 2 TB LUN mounted over iSCSI, hosted on (and made redundant by) the NAS. Here is a graphical representation of the setup, including the problem: I can mount a new volume on only one of the data nodes. This leads to the obvious problem of backups failing, since the volume is mounted on only one of the VMs.

How do I take snapshots with a single volume? Is the recommendation to have a shared volume on the NAS (e.g. via SFTP / rsync) and carry out the backup there?

The volume needs to be available to all nodes, so using an NFS mount is common.
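For reference, once the NFS share is mounted at the same path on every node, the setup looks roughly like this. The mount point `/mnt/es_snapshots` and the repository name `nfs_backup` are illustrative, not from the thread. First, whitelist the path on every master and data node (requires a restart):

```yaml
# elasticsearch.yml on every master and data node
path.repo: ["/mnt/es_snapshots"]
```

Then register a shared-filesystem (`fs`) snapshot repository pointing at that path:

```
PUT _snapshot/nfs_backup
{
  "type": "fs",
  "settings": {
    "location": "/mnt/es_snapshots"
  }
}
```

Registration will fail with a verification error if any node cannot read and write the shared path, which is a quick way to confirm the mount is visible cluster-wide.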


Thank you very much for that, @Christian_Dahlqvist.

Would it be correct to assume that the size required will be equal to, or 10% to 15% higher than, the size of the primary shards? Or do the storage calculations need to factor in the overall data size (primary + replica shards)?

The initial snapshot will be roughly the size of the primary shards, as replica shards are not snapshotted. If you add additional snapshots without deleting older ones, the repository will grow as new and merged segments are snapshotted.
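As a sketch of the incremental behaviour described above, against an `fs`-type repository (the repository name `nfs_backup` and snapshot names here are examples):

```
PUT _snapshot/nfs_backup/snapshot-1?wait_for_completion=true

# A later snapshot copies only segments not already in the repository,
# so it adds far less than the full primary-shard size
PUT _snapshot/nfs_backup/snapshot-2?wait_for_completion=true

# Deleting an old snapshot reclaims only segments no longer
# referenced by any remaining snapshot
DELETE _snapshot/nfs_backup/snapshot-1
```

This is why repository size grows over time if old snapshots are kept: merges rewrite segments, and the rewritten segments are copied again by the next snapshot.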


Thank you very much. Have a wonderful time ahead. 🙂

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.