On Wed, Nov 19, 2014 at 7:41 PM, Mathew D <mathew.d...@gmail.com> wrote:

Hi there,
Any suggestions as to how I can create full ES backups without using
snapshot functionality?
The reason I can't use snapshots is that they require a shared directory
mounted on all nodes, but my 3-node cluster spans two data centres and I am
not able to NFS mount over the WAN. I'm also not permitted to back up to
AWS/S3.
As I have 2 replicas of each index, I'm leaning towards the idea of
stopping one node and backing up that node's data directory but wondered if
anyone could suggest a more elegant way. For example, could I snapshot to
a local directory on each node, then manually combine the contents into a
single cohesive backup?
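For what it's worth, the per-node local snapshot idea could be sketched roughly as below. This is a hypothetical, untested example: it assumes the ES 1.x snapshot REST API, that the repository location is whitelisted via `path.repo` in elasticsearch.yml on each node, and the host, repository name, and path are made up.

```python
# Hypothetical sketch: register a local "fs" snapshot repository on a node
# and trigger a snapshot of all indices (assumes the ES 1.x snapshot API;
# the repository location must be whitelisted via path.repo).
import json
from urllib import request

def repo_settings(location):
    """Build the registration body for a local filesystem repository."""
    return {"type": "fs", "settings": {"location": location, "compress": True}}

def put(url, body=None):
    """Issue an HTTP PUT and return the response body."""
    data = json.dumps(body).encode() if body is not None else None
    req = request.Request(url, data=data, method="PUT",
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return resp.read()

# Example calls against a hypothetical node (commented out — needs a live cluster):
# put("http://localhost:9200/_snapshot/local_backup",
#     repo_settings("/var/backups/elasticsearch"))
# put("http://localhost:9200/_snapshot/local_backup/snap_1?wait_for_completion=true")
```

The open question, as noted above, is whether the resulting per-node repositories can be merged into one coherent backup; nothing here attempts that.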
On Wednesday, November 19, 2014 5:32:14 PM UTC-8, Ivan Brusic wrote:

How many shards for each index? I am assuming that each node does not have
all the data.
If you can stop indexing, you can just rsync the data to a local directory.
Make sure you execute a flush, and preferably an optimize, in order to merge
the segments on disk. The tricky part is the manual combining you referred to.
BTW, 3 nodes / 2 data centres? Sounds like a recipe for trouble.
Cheers,
Ivan
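The quiesce-then-copy approach could be sketched as follows. Only the directory copy is shown as runnable code; the flush/optimize calls, paths, and the idea of stopping the node first are assumptions taken from the discussion, not a tested procedure.

```python
# Minimal sketch of the "quiesce, then copy the data directory" idea.
# Paths are examples; the REST endpoints assume the ES 1.x API.
import shutil
from pathlib import Path

def copy_data_dir(data_dir, backup_dir):
    """Recursively copy a node's data directory; returns files copied."""
    src, dst = Path(data_dir), Path(backup_dir)
    copied = 0
    for f in src.rglob("*"):
        if f.is_file():
            target = dst / f.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target)   # preserves timestamps and permissions
            copied += 1
    return copied

# In practice you would first stop indexing and flush, e.g. (hypothetical):
#   curl -XPOST 'http://localhost:9200/_flush'
#   curl -XPOST 'http://localhost:9200/_optimize?max_num_segments=1'
# then stop the node and run:
#   copy_data_dir("/var/lib/elasticsearch/data", "/backup/elasticsearch/data")
```

rsync with `-a` would do the same job more efficiently for repeated backups; the Python version is just to make the copy step concrete.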
Thanks for the quick response. We've got 5 shards per index, so with 2
replicas each node should in theory have a full set of data. I was hoping
that taking the node out of service by stopping it would avoid the
disruption of pausing indexing, but I couldn't find any documentation
confirming whether such an operation leaves the data files in a consistent
state that can reliably be used for a restore.
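As a quick sanity check of the "each node holds a full copy" claim, using the numbers from this thread (5 shards, 2 replicas, 3 nodes) and assuming the allocator spreads copies evenly:

```python
# With number_of_replicas = 2, each shard exists as 1 primary + 2 replicas.
# On a 3-node cluster that is exactly one copy of every shard per node,
# i.e. each node can hold a complete data set.
shards = 5
replicas = 2
nodes = 3

copies_per_shard = 1 + replicas              # primary plus replicas
total_shard_copies = shards * copies_per_shard
copies_per_node = total_shard_copies // nodes

print(copies_per_shard)   # 3: one copy of each shard fits on each node
print(copies_per_node)    # 5: a full set of the index's shards per node
```

This only holds while all three nodes are up and allocation is balanced; if a node is down or copies are relocating, a single node may temporarily hold less than a full set.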
Evan's suggestion of elasticdump looks like the closest to what I'm after,
although unfortunately I don't have node.js/npm installed (and in an
enterprise environment they could be tricky to get installed).
NB: I hear your concerns re the cluster design. The remote node was
incorporated to minimise data loss following a data centre failure;
however, because of the risk of split brain, it actually functions more as
warm DR than any sort of HA...
Regards,
Mat