Copying ES indexes back to a new clean disk: is this feasible?


(Yu Fei Cai) #1

Hello
We have the Elasticsearch directories "data", "work", and "logs" on one local disk, /dev/xvdc1. The index data is in the "data" directory.
The local disk will fill up soon. For various reasons, our only option is to replace the existing local disk /dev/xvdc1 with a larger one, but that operation wipes all data on it. We have two ES nodes, and the replica count is 1.
To move to the larger disk, my planned steps are:
Stop the ES service on node1, and keep ES running on node2 to provide service.
On node1, back up the "data", "work", and "logs" directories to another location.
On node1, replace the disk with the new clean one, copy the "data", "work", and "logs" directories back, and start the ES service.
Then do the same thing on node2.
Will this copy approach work without data loss?
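
The backup and copy-back steps above could be sketched roughly as follows. This is only an illustration under assumptions: the helper name `copy_dirs` and the use of Python's `shutil.copytree` are not from the post, and the ES service is assumed to be stopped before either copy. `copytree` keeps permissions and timestamps but not file ownership, so ownership for the user that runs Elasticsearch would have to be restored separately.

```python
import shutil
from pathlib import Path

# Directory names taken from the post; everything else here is hypothetical.
ES_DIRS = ("data", "work", "logs")


def copy_dirs(src_root, dst_root, names=ES_DIRS):
    """Copy each named directory from src_root to dst_root.

    Returns the list of destination paths that were created. Refuses to
    overwrite an existing destination so an earlier backup is never clobbered.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = []
    for name in names:
        src = src_root / name
        dst = dst_root / name
        if not src.is_dir():
            raise FileNotFoundError(f"missing source directory: {src}")
        if dst.exists():
            raise FileExistsError(f"destination already exists: {dst}")
        shutil.copytree(src, dst)  # creates dst and copies files with metadata
        copied.append(dst)
    return copied
```

For the backup, `src_root` would be the mount point on /dev/xvdc1 and `dst_root` the backup location; for the copy-back onto the new disk the two are swapped.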


(Ed) #2

Since you have replication of 1, you will not lose any data.

Since you're on Amazon, why not just spin up a new machine and decommission the old, smaller node?
https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-filtering.html

This way you don't have to play with mount points and worry.

Or you can use the above to migrate the data off while you do your changes.

But with replication 1 you can completely kill a node, delete everything, and start it up again. Elasticsearch will then re-replicate the data.
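
The allocation-filtering approach from the link above could look roughly like this. It is a minimal sketch, not a tested procedure: the helper names are hypothetical, and `10.0.0.1` in the usage note is a placeholder IP. The setting name `cluster.routing.allocation.exclude._ip` and the `/_cluster/settings` endpoint are the ones the allocation-filtering documentation describes. Note that Elasticsearch never puts two copies of the same shard on one node, so in a two-node cluster with one replica the excluded node's shards have nowhere to go until a new node has joined.

```python
import json
import urllib.request


def exclude_node_body(node_ip):
    """Build the cluster-settings body that asks ES to move shards off one node."""
    return {"transient": {"cluster.routing.allocation.exclude._ip": node_ip}}


def put_cluster_settings(base_url, body):
    """Send the body to PUT /_cluster/settings and return the parsed reply."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A call would then be something like `put_cluster_settings("http://ES_F5_IP:9200", exclude_node_body("10.0.0.1"))`, issued once the replacement node is part of the cluster.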


(Ed) #3

But use Head, Kopf, or some other GUI to make sure you have a copy of each shard number on each server.

I would make sure the cluster is in a yellow state after killing the server (before you delete anything). If it goes red, then for some reason you don't have healthy replication.

(Do backups before deleting, just in case, at least until you feel comfortable.)
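
The yellow-versus-red check can also be done without a GUI by reading the cluster health API. The sketch below is an assumption-laden illustration: the function names are made up, and only the `status` field of `GET /_cluster/health` is used. Green and yellow both mean every primary shard is assigned; red means at least one primary is missing.

```python
import json
import urllib.request


def fetch_health(base_url):
    """Return the parsed JSON from GET /_cluster/health."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/_cluster/health") as resp:
        return json.load(resp)


def safe_to_delete(health):
    """Return True when no primary shard is missing.

    With one node stopped, "yellow" is the expected status: all primaries
    are on the surviving node and only the replicas are unassigned.
    "red" means the surviving node does not hold a complete copy.
    """
    return health.get("status") in ("green", "yellow")
```

Running `safe_to_delete(fetch_health("http://ES_F5_IP:9200"))` after stopping a node, and before deleting anything on it, would give the same answer as eyeballing the status in Head or Kopf.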


(Yu Fei Cai) #4

Thanks for your reply!

I tested this on my two Elasticsearch test nodes. Removing the index directory and copying it back worked well, but I found that all primary shards ended up on the same node instead of being spread evenly across the two nodes.
==== My ES servers ====
Two nodes, esdev1 and esdev2; 12 shards per node; replica count is 1.
In elasticsearch.yml on esdev1:
node.rack: rack1
cluster.routing.allocation.awareness.attributes: rack
In elasticsearch.yml on esdev2:
node.rack: rack2
cluster.routing.allocation.awareness.attributes: rack
Before the change, the 12 primary shards were spread evenly across the two nodes; each node had 6 primary shards.
[root@esdev1 ~]# curl http://ES_F5_IP:9200/_cat/shards?pretty | grep dianha | grep p | grep esdev1
dianha 3 p STARTED 0 115b ..11.63 esdev1
dianha 1 p STARTED 0 115b ..11.63 esdev1
dianha 7 p STARTED 0 115b ..11.63 esdev1
dianha 11 p STARTED 0 115b ..11.63 esdev1
dianha 9 p STARTED 0 115b ..11.63 esdev1
dianha 5 p STARTED 0 115b ..11.63 esdev1
[root@esdev1 ~]# curl http://ES_F5_IP:9200/_cat/shards?pretty | grep dianha | grep p | grep esdev2
dianha 10 p STARTED 1 2.4kb ..11.64 esdev2
dianha 6 p STARTED 0 115b ..11.64 esdev2
dianha 2 p STARTED 0 115b ..11.64 esdev2
dianha 0 p STARTED 0 115b ..11.64 esdev2
dianha 4 p STARTED 0 115b ..11.64 esdev2
dianha 8 p STARTED 0 115b ..11.64 esdev2
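
The per-node primary counts read off the two listings above can also be computed from the raw `_cat/shards` text. This is a minimal sketch with a hypothetical function name; it assumes the default column order seen in the output above (index, shard, prirep, state, docs, store, ip, node) and node names without spaces. Unlike `grep p`, it tests the prirep column itself.

```python
from collections import Counter


def primaries_per_node(cat_shards_text, index):
    """Count primary shards per node from `_cat/shards` text output."""
    counts = Counter()
    for line in cat_shards_text.splitlines():
        cols = line.split()
        if len(cols) < 8:
            continue  # blank lines and unassigned shards have no ip/node columns
        if cols[0] == index and cols[2] == "p":
            counts[cols[-1]] += 1
    return dict(counts)
```

Fed the full output for the `dianha` index at this point in the test, it should report 6 primaries for each of esdev1 and esdev2.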

#Then I started my actions:
Keep ES running on the esdev2 node to provide service.

#On esdev1 node:

Back up the index directory "..../data" to another location.
Stop the ES service.
Delete the index directory "../data".

#On esdev2 node:
Insert some data into the existing index through the API; the insert succeeded. (This confirms that esdev2 can provide service while esdev1 is down.)
e.g.: curl -XPUT https://ES_F5_IP:9200/dianha/haoma/2 -d '{"name":"12366922"}'

Then I checked the shard allocation and found that all 12 primary shards were on esdev2.
(Is this expected?)
[root@esdev2 ~]# curl http://ES_F5_IP:9200/_cat/shards?pretty | grep dianha | grep p
dianha 3 p STARTED 0 79b ..11.64 esdev2
dianha 10 p STARTED 1 2.4kb ..11.64 esdev2
dianha 6 p STARTED 0 79b ..11.64 esdev2
dianha 2 p STARTED 0 79b ..11.64 esdev2
dianha 1 p STARTED 0 79b ..11.64 esdev2
dianha 0 p STARTED 0 79b ..11.64 esdev2
dianha 7 p STARTED 0 79b ..11.64 esdev2
dianha 11 p STARTED 1 2.4kb ..11.64 esdev2
dianha 9 p STARTED 0 79b ..11.64 esdev2
dianha 5 p STARTED 0 79b ..11.64 esdev2
dianha 4 p STARTED 0 79b ..11.64 esdev2
dianha 8 p STARTED 0 79b ..11.64 esdev2

#On esdev1 node:
Replace the old disk with the new disk, format the new disk, create the directory for the index data, and mount the new directory.
Copy the index data back from the backup directory to the new directory.
Start the ES service.
ES started successfully without any exceptions.

#Checks:

Check the cluster health: it is green, and all shards were assigned successfully.
Search the data through the API: all old and new data can be found.
Check the shard allocation again: all 12 primary shards are still on esdev2.
I also tried inserting new data (expecting the primary shards to be spread across the two nodes again automatically), but all 12 primary shards are still on esdev2 and all 12 replica shards are on esdev1.
It seems that my allocation awareness setting in elasticsearch.yml has no effect after these actions.
We absolutely need the primary shards to be spread evenly across the two nodes, as they were before I made the change.

I also tried to reroute with this command:
curl -XPOST 'http://ES_F5_IP:9200/_cluster/reroute?pretty' -d '{
  "commands" : [ {
    "allocate" : {
      "index" : "dianha",
      "shard" : 1,
      "node" : "esdev1",
      "allow_primary" : true
    }
  } ]
}'
but it returned a 400 error; the number of unassigned shards is 0.
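
The reroute `allocate` command only places unassigned shards, which would be consistent with the 400 here since no shard is unassigned. One approach sometimes used to swap primary and replica roles is the reroute `cancel` command with `allow_primary`: cancelling the primary copy should let the existing replica be promoted, after which the cancelled copy recovers as a replica. The sketch below only builds such a request body. The helper name is hypothetical, and the behaviour is an assumption that should be verified on a test cluster first, for example with the reroute API's `dry_run` parameter.

```python
import json


def cancel_primaries_body(index, shards, node):
    """Build a reroute body that cancels the primary copy of each shard on `node`."""
    return {
        "commands": [
            {
                "cancel": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    "allow_primary": True,
                }
            }
            for shard in shards
        ]
    }


# Hypothetical example: the six shards whose primaries were on esdev1
# before the disk change, now cancelled on esdev2.
body = cancel_primaries_body("dianha", [1, 3, 5, 7, 9, 11], "esdev2")
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body of a POST to `/_cluster/reroute`, in the same way as the `allocate` attempt above.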

Then I kept esdev1 running and did the same disk change on esdev2. A similar shard allocation resulted: all 12 primary shards are on esdev1, and all 12 replica shards are on esdev2.

Any ideas?


(Yu Fei Cai) #5

Can anyone help with this?


(Yu Fei Cai) #6

I found that the primary shard allocation issue is unrelated to deleting and copying back my index directory.
Closing this ticket now.
Summary: deleting the index directory and copying it back works.

