We have about 1.8 TB of data in a single index. The machine has 2 TB of disk storage, which is already 95% full.
We have raised the flood-stage watermark to 98%, but that only buys us another 3-4 days of data.
We cannot delete the index data, and we cannot increase the disk space because this is a bare-metal server (it only allows adding a new, separate volume, or enlarging the same disk with data loss).
We have now updated our system to write to new indices with monthly date tags, such as index-2020-07-01. New data will go into monthly indices from now on, but the older 1.8 TB single index needs to be captured in a snapshot.
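To illustrate the monthly naming scheme, here is a minimal sketch that derives the index name from a date. The function name and the `index` prefix are placeholders for illustration; only the index-2020-07-01 pattern comes from our setup.

```python
from datetime import date


def monthly_index_name(day: date, prefix: str = "index") -> str:
    """Return the index name for the month containing `day`."""
    # Every day of a month maps to that month's first day,
    # which gives names like index-2020-07-01.
    first_of_month = day.replace(day=1)
    return f"{prefix}-{first_of_month:%Y-%m-%d}"


print(monthly_index_name(date(2020, 7, 23)))  # index-2020-07-01
```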
If we snapshot directly to S3 using the repository plugin, does that require some local storage for processing before the push, or will it work even if we have no storage left? Also, approximately how long would it take to snapshot 1.8 TB to S3?
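For the time question, a rough back-of-the-envelope estimate is just size divided by sustained throughput. The sketch below assumes decimal units (1 TB = 10^12 bytes) and uses a few illustrative upload rates; these rates are assumptions, not measurements, and the real figure depends on the network, the snapshot throttle setting, and the load on the node.

```python
TB = 10**12  # decimal terabyte, in bytes
MB = 10**6   # decimal megabyte, in bytes


def snapshot_hours(size_bytes: float, throughput_bytes_per_sec: float) -> float:
    """Estimate transfer time in hours for a given sustained throughput."""
    return size_bytes / throughput_bytes_per_sec / 3600


index_size = 1.8 * TB

# Illustrative sustained upload rates in MB/s (assumed, not measured).
for rate_mb in (20, 40, 100):
    hours = snapshot_hours(index_size, rate_mb * MB)
    print(f"{rate_mb} MB/s -> about {hours:.1f} hours")
```

At a sustained 40 MB/s, for example, this works out to about 12.5 hours; halving the rate doubles the time.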
If we add a new separate volume/disk, can we create a snapshot there and then push it to S3?
We are being cautious before proceeding with any approach because we do not want to lose any of this data.
You can create multiple snapshot repositories which may also be on a different mount point. So it is indeed possible to order a new disk, create a repository on this disk and create the snapshot there.
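As a rough sketch of what that looks like, the snippet below builds the two snapshot API requests involved: one registers a shared file system (`fs`) repository on the new disk, the other starts a snapshot of the old index into it. The repository name `backup_disk`, the mount path `/mnt/backup/es_snapshots`, the snapshot name, and the index name `old-index` are all hypothetical. It only prints the requests, so nothing is sent to a cluster. Note that the repository location has to be listed under `path.repo` in `elasticsearch.yml` before the repository can be registered.

```python
import json
from typing import Dict, List, Tuple

Request = Tuple[str, str, Dict]


def register_fs_repo(repo: str, location: str) -> Request:
    """Build the request that registers a shared file system repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return ("PUT", f"/_snapshot/{repo}", body)


def create_snapshot(repo: str, snapshot: str, indices: List[str]) -> Request:
    """Build the request that snapshots the given indices into `repo`."""
    body = {
        "indices": ",".join(indices),
        # Only the index data is wanted here, not the cluster state.
        "include_global_state": False,
    }
    # Return immediately; the snapshot keeps running in the background.
    path = f"/_snapshot/{repo}/{snapshot}?wait_for_completion=false"
    return ("PUT", path, body)


# Hypothetical names, for illustration only.
requests_to_send = [
    register_fs_repo("backup_disk", "/mnt/backup/es_snapshots"),
    create_snapshot("backup_disk", "old-index-snapshot-1", ["old-index"]),
]

for method, path, body in requests_to_send:
    print(method, path)
    print(json.dumps(body, indent=2))
```

The printed requests can be pasted into Kibana Dev Tools or sent with any HTTP client. Progress can then be checked with `GET /_snapshot/backup_disk/old-index-snapshot-1/_status`.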
So when you say multiple repositories, do you mean I can split my large index into multiple parts and save them in different snapshots? As far as I know, we can only take a snapshot of a complete index, not of part of its data (say, the data matching a query for a single month).
What he meant is that you can take snapshots of different indices and put them in different places. As far as I know, you can only snapshot a complete index (as you said).