After reading several posts about backing up and restoring
Elasticsearch, I came up with a solution for the simple situation we
need to support.
I would appreciate feedback on anything that is wrong or that I have
overlooked.
Perhaps others in the same situation can use it as a template, as I
haven't found concrete configuration examples so far.

The Situation:
ES will be deployed on two machines. One machine is a kind of hot
standby, i.e. no user activity happens on it.
There is no filesystem shared between the two machines.
The ES node and the client are deployed on the same machine and
connected locally.
We want ES to span a cluster over both machines, with all data fully
replicated on both machines.
If one machine becomes unavailable, the other machine can take over
without any administrative interaction.
For the backup, it doesn't matter which machine's filesystem is used,
as both hold fully replicated data.
Network traffic for searches and gets will stay on the same machine.

The configuration (assuming the machines have the IPs 192.168.1.200
and 192.168.1.201):

Machine with IP 192.168.1.200:

cluster:
  name: es-backuped
  routing.allocation.awareness:
    attributes: machine
    force.machine.values: A, B
node:
  machine: A
  local: false
discovery.zen.ping:
  multicast.enabled: false
  unicast.hosts: 192.168.1.201

Machine with IP 192.168.1.201:

cluster:
  name: es-backuped
  routing.allocation.awareness:
    attributes: machine
    force.machine.values: A, B
node:
  machine: B
  local: false
discovery.zen.ping:
  multicast.enabled: false
  unicast.hosts: 192.168.1.200

I've used unicast as we do not need any dynamic cluster changes.
To back up the data, we will check the cluster state to ensure the
nodes are in sync. Then we can either shut down the node on the
inactive machine or use the settings API
(http://www.elasticsearch.org/guide/reference/api/admin-indices-update-settings.html)
to disable flush or set all indexes to read-only mode.
After the backup is done, we will start the node again or revert the
settings changes we made before.
In fact, if the settings api is used, the configuration might work for
balanced machines (both active) as well.
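
A minimal sketch of the settings-API variant, in Python, in case it
helps to discuss the procedure. It is untested against a real cluster
and rests on assumptions that are not part of the setup above: the
node answers HTTP on localhost:9200, "in sync" is taken to mean a
green cluster health with both nodes present and no shards moving,
and read-only mode is switched via the index.blocks.read_only
setting. The names ES_URL, is_in_sync, read_only_body and
backup_with_read_only are made up for illustration, and run_backup
stands for whatever copies the data directory.

```python
import json
import urllib.request

# Assumption: the local node listens on the default HTTP port.
ES_URL = "http://localhost:9200"


def is_in_sync(health, expected_nodes=2):
    """Decide from a cluster health response whether both nodes hold all shards."""
    return (
        health.get("status") == "green"
        and health.get("number_of_nodes") == expected_nodes
        and health.get("relocating_shards", 0) == 0
        and health.get("initializing_shards", 0) == 0
        and health.get("unassigned_shards", 0) == 0
    )


def read_only_body(enabled):
    """Build the settings payload that turns read-only mode on or off."""
    return json.dumps({"index": {"blocks.read_only": enabled}}).encode("utf-8")


def es_request(method, path, body=None):
    """Send one JSON request to the local node and return the parsed reply."""
    request = urllib.request.Request(
        ES_URL + path,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def backup_with_read_only(run_backup):
    """Check sync, set all indexes read-only, run the backup, then revert."""
    health = es_request("GET", "/_cluster/health")
    if not is_in_sync(health):
        raise RuntimeError("cluster not in sync: %r" % (health,))
    es_request("PUT", "/_settings", read_only_body(True))
    try:
        run_backup()  # hypothetical callable, e.g. copying the data directory
    finally:
        # Revert the setting even if the backup itself failed.
        es_request("PUT", "/_settings", read_only_body(False))
```

The try/finally is there so that a failed backup does not leave the
indexes read-only. The same shape would apply to the disable-flush
variant with a different settings payload.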

Would that work properly? Is it a proper solution for the given
scenario?

Thanks
Claas