Suppose there are two nodes n1 and n2. n1 contains all primary shards and n2 contains all replica shards.
At moment t1, n1 contains 1000 docs and n2 contains 1000 docs. At t2, n2 goes down somehow and 500 more docs are added to n1. How many docs will there be in n2 (which comes back at t3) when n1 contains 1500 docs? Also, what happens to the 1000 docs that n2 had at t1 (are they deleted or something)? Does it re-index the whole data set once again while making replica shards on node n2? Can someone explain the process behind the scenes?
n2 will come back up and n1 (which will be the current master) will ask n2 what it has on disk for shards. n1 will then decide to assign all those unassigned shards back to n2.
For each shard, n2 will pull all the files from n1 that it needs to make its disk exactly the same as the shard on n1. It'll then pull the transaction log from n1 and replay that against its files on disk. Then it'll send n1 a message telling it that it has finished restoring that shard. Then it'll start on another one.
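To make those two steps concrete, here is a toy Python model of one shard's recovery. It is only a sketch under assumptions, not Elasticsearch code: the function `recover_replica`, the segment names `seg_a`/`seg_b`, and the idea of representing a segment file as a tuple of doc IDs are all made up for illustration. It also assumes the old segment on n1 was not rewritten by a merge while n2 was down; if it had been, n2 would have to copy everything.

```python
def recover_replica(primary_segments, primary_translog, replica_segments):
    """Toy model of file-based recovery for a single shard.

    Segments are modelled as {file_name: tuple_of_doc_ids}; the translog is
    a list of doc IDs indexed on the primary but not yet written to a segment.
    All names here are hypothetical and only illustrate the idea.
    """
    copied, reused, removed = [], [], []

    # Step 1: make the replica's files identical to the primary's files.
    for name, docs in primary_segments.items():
        if replica_segments.get(name) == docs:
            reused.append(name)            # identical file already on disk
        else:
            replica_segments[name] = docs  # pull the file from the primary
            copied.append(name)
    for name in list(replica_segments):
        if name not in primary_segments:
            del replica_segments[name]     # stale file the primary no longer has
            removed.append(name)

    # Step 2: replay the primary's transaction log on top of those files.
    replica_docs = set()
    for docs in replica_segments.values():
        replica_docs.update(docs)
    for doc_id in primary_translog:
        replica_docs.add(doc_id)

    return replica_docs, copied, reused, removed


# The scenario from the question: n2 went down holding 1000 docs,
# then n1 indexed 500 more (400 already in a new segment, 100 in the translog).
primary_segments = {
    "seg_a": tuple(range(0, 1000)),
    "seg_b": tuple(range(1000, 1400)),
}
primary_translog = list(range(1400, 1500))
replica_segments = {"seg_a": tuple(range(0, 1000))}

docs, copied, reused, removed = recover_replica(
    primary_segments, primary_translog, replica_segments
)
print(len(docs), copied, reused, removed)
# prints: 1500 ['seg_b'] ['seg_a'] []
```

In this model n2 ends up with the same 1500 docs as n1, and nothing is re-indexed from the original source: files are copied as they are and only the translog operations are applied again.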
I've glossed over lots of parts, mostly because I don't know much about them. But that is the gist of it. There are some things that let you skip the copy file step, but they only work if the shards contain exactly the same documents, so your scenario about indexing 500 more documents invalidates them anyway.
Thanks for your reply, Nik. I still have a small doubt: will those 1000 docs on n2 still be there on disk, or will they be deleted? Also, does it re-index the whole data set once again to make segments on n2's shards?