I'm currently working on automating cluster availability. We have two Elasticsearch clusters in two different regions:
- Primary in CA.
- Standby in ATL.
Each cluster has 2 servers. Snapshots are enabled on both, and the primary is the one currently serving our application.
We use HAProxy to point our application at either the primary or the standby: we switch when we need to perform maintenance, and if one cluster fails we manually point to the one that is up.
The process of bringing them up is pretty much manual, and I want to make it more automatic.
The primary cluster takes a snapshot every day at 12am, which is dumped to our TrueNAS. It is automatically replicated to a mount point on the same TrueNAS system that is accessible to the standby, to be used in case we need to restore the cluster.
The standby cluster also takes a snapshot of its data, which is dumped to a different location on the TrueNAS; this one is not replicated to the primary.
I was wondering if ES has some mechanism for this, or if anyone has suggestions on how I can accomplish it.
I created a script, triggered by a cron job every day at 6:00am on the standby cluster, that removes all open indices and restores the snapshot from the primary cluster:
#!/bin/bash
# Variables
SNAPSHOT=`/bin/ls -lahrt /es_snapshot/replica/production_snap | /bin/grep snap | /bin/awk '{print $9}' | /bin/grep "$(date '+%Y_%m_%d')"| /bin/sed s/snap-//g| /bin/sed s/.dat//g`
LOG_FILE="/var/log/elasticsearch/es_snapshot_restore_from_primary.log"
CURRENT=$(/usr/bin/curl -s -XGET "http://$(/usr/bin/facter ipaddress):9200/_cat/indices?v" | grep open | awk '{print $3}')
DATE_LOG=`date +%Y_%m_%d_%T`
INDICES=(index1 index2 index3 index4l)
RESULT=()
# Restore Function
AUTO_RESTORE () {
if [ "$SNAPSHOT" != "" ]; then
echo "$DATE_LOG -- Checking status of Snapshot $SNAPSHOT" >> $LOG_FILE
SNAP_CHECK="/usr/bin/curl -s -XGET http://`/usr/bin/facter ipaddress`:9200/_snapshot/sjc_es_data/$SNAPSHOT/_status"
SNAP_RESULT=`$SNAP_CHECK | grep -o -i '"SUCCESS"'`
if [ "$SNAP_RESULT" == '"SUCCESS"' ]; then
echo "$DATE_LOG -- Restoring Snapshot $SNAPSHOT" >> $LOG_FILE
/usr/bin/curl -s -XPOST http://`/usr/bin/facter ipaddress`:9200/_snapshot/sjc_es_data/$SNAPSHOT/_restore
exit 0
else
echo "$DATE_LOG -- Snapshot $SNAPSHOT could not be restored. [FAIL]" >> $LOG_FILE
$SNAP_CHECK >> $LOG_FILE
exit 1
fi
else
echo "$DATE_LOG -- No snapshot found to restore" >> $LOG_FILE
exit 1
fi
}
# Looks for open indices
for i in ${INDICES[@]}
do
STATUS="/usr/bin/curl -s -XGET http://`/usr/bin/facter ipaddress`:9200/_cat/indices/$i?h=status"
RESULT+=($($STATUS))
done
# Deletes indices that are open and restore the snapshot from primary
MATCH=$(echo "${RESULT[@]:0}" | grep open)
if [[ ! -z $MATCH ]];then
echo "$DATE_LOG -- There are some open indices" >> $LOG_FILE
echo "$DATE_LOG -- The following indices will be erased in order to restore the snapshot:" $CURRENT >> $LOG_FILE
for index in $CURRENT; do
/usr/bin/curl -XDELETE http://`/usr/bin/facter ipaddress`:9200/$index
sleep 2
done
AUTO_RESTORE
else
AUTO_RESTORE
fi
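One part I'm not sure about is picking the snapshot name by parsing `ls` output. Here is a minimal sketch of an alternative, assuming the repository files are named `snap-<name>.dat` (which is what the sed calls above imply) and that the name contains the date. `find_snapshot` is a hypothetical helper name, not something from ES or from the script above:

```shell
#!/bin/bash
# Hypothetical helper: print the name of the snapshot whose file name
# contains the given date string, without parsing `ls` output.
# Assumes files in the repository directory look like snap-<name>.dat.
find_snapshot() {
    local repo_dir=$1 day=$2 f name
    for f in "$repo_dir"/snap-*"$day"*.dat; do
        [ -e "$f" ] || continue      # glob matched nothing
        name=${f##*/}                # strip the directory part
        name=${name#snap-}           # strip the "snap-" prefix
        name=${name%.dat}            # strip the ".dat" suffix
        printf '%s\n' "$name"
    done | tail -n 1                 # if several match, keep the last one
}

# Possible use in place of the SNAPSHOT= line:
# SNAPSHOT=$(find_snapshot /es_snapshot/replica/production_snap "$(date '+%Y_%m_%d')")
```

If nothing matches, the function prints nothing, so the existing empty-string check in AUTO_RESTORE would still apply.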
By the way this is version 2.3.1.
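I also suspect that grepping for "SUCCESS" anywhere in the status response is too loose. A possibly stricter check could look like the sketch below. It assumes the `_status` response on this version carries a `"state"` field per snapshot, which I have not verified against every response shape, and `snapshot_succeeded` is a hypothetical name:

```shell
#!/bin/bash
# Hypothetical helper: reads a snapshot _status response on stdin and
# succeeds only if some "state" field has the value SUCCESS.
# Assumption: the response contains "state":"SUCCESS" for a finished snapshot.
snapshot_succeeded() {
    grep -Eq '"state"[[:space:]]*:[[:space:]]*"SUCCESS"'
}

# Possible use inside AUTO_RESTORE:
# if /usr/bin/curl -s -XGET "http://$(/usr/bin/facter ipaddress):9200/_snapshot/sjc_es_data/$SNAPSHOT/_status" | snapshot_succeeded; then
#     ... restore ...
# fi
```

This still matches on text rather than parsing the JSON, so it only narrows the match to the `state` field.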