Unassigned shards after migrating an index with rsync

Hi Guys,

I'm using rsync to migrate an Elasticsearch index between two clusters. I know Snapshot and Restore and elasticsearch-dump can do the same thing, but:

  • elasticsearch-dump can't handle a big index (the index I'm migrating is about 10 GB)
  • Snapshot and Restore goes through a file system repository, and I want to migrate the index between clusters without any stops or pauses

So we use rsync to copy the index from the source cluster to the destination cluster, and rebuild it on the destination cluster.
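
For reference, this is roughly the kind of copy we run. The host name, cluster names, and paths below are placeholders (not our real ones), and the layout assumes the default data path of the Elasticsearch 2.x line.

# Copy one index's shard directories from a source data node to the same
# location on a destination data node (placeholder host and paths).
rsync -avz \
  /var/lib/elasticsearch/src-cluster/nodes/0/indices/log.direct-engine-server.20171101_68/ \
  esuser@dest-node:/var/lib/elasticsearch/dst-cluster/nodes/0/indices/log.direct-engine-server.20171101_68/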

The problem is that Elasticsearch can't recover the migrated index on the destination cluster; its shards end up unassigned.

Here is the Kibana screenshot:

Here is the error log:

[2017-11-01 10:47:04,798][WARN ][indices.cluster          ] [gh-data-rt0201] [[log.direct-engine-server.20171101_68][0]] marking and sending shard failed due to [failed recovery]
[log.direct-engine-server.20171101_68][[log.direct-engine-server.20171101_68][0]] IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: TranslogCorruptedException[expected shard UUID [[42 43 79 7a 6b 76 76 74 53 38 57 36 6f 74 47 31 57 73 57 4b 32 41]] but got: [[31 49 56 70 62 37 6e 50 51 43 4f 42 63 4a 30 2d 31 36 44 43 4b 51]] this translog file belongs to a different translog];
        at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:250)
        at org.elasticsearch.index.shard.StoreRecoveryService.access$100(StoreRecoveryService.java:56)
        at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:129)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: [log.direct-engine-server.20171101_68][[log.direct-engine-server.20171101_68][0]] EngineCreationFailureException[failed to create engine]; nested: TranslogCorruptedException[expected shard UUID [[42 43 79 7a 6b 76 76 74 53 38 57 36 6f 74 47 31 57 73 57 4b 32 41]] but got: [[31 49 56 70 62 37 6e 50 51 43 4f 42 63 4a 30 2d 31 36 44 43 4b 51]] this translog file belongs to a different translog];
        at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:155)
        at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25)
        at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1515)
        at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1499)
        at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:972)
        at org.elasticsearch.index.shard.IndexShard.performTranslogRecovery(IndexShard.java:944)
        at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:241)
        ... 5 more
Caused by: TranslogCorruptedException[expected shard UUID [[42 43 79 7a 6b 76 76 74 53 38 57 36 6f 74 47 31 57 73 57 4b 32 41]] but got: [[31 49 56 70 62 37 6e 50 51 43 4f 42 63 4a 30 2d 31 36 44 43 4b 51]] this translog file belongs to a different translog]
        at org.elasticsearch.index.translog.TranslogReader.open(TranslogReader.java:235)
        at org.elasticsearch.index.translog.Translog.openReader(Translog.java:377)
        at org.elasticsearch.index.translog.Translog.recoverFromFiles(Translog.java:334)
        at org.elasticsearch.index.translog.Translog.<init>(Translog.java:179)
        at org.elasticsearch.index.engine.InternalEngine.openTranslog(InternalEngine.java:208)
        at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:151)
        ... 11 more
[2017-11-01 10:47:05,031][WARN ][indices.cluster          ] [gh-data-rt0201] [[log.direct-engine-server.20171031_68][0]] marking and sending shard failed due to [failed recovery]
[log.direct-engine-server.20171031_68][[log.direct-engine-server.20171031_68][0]] IndexShardRecoveryException[failed to recovery from gateway]; nested: EngineCreationFailureException[failed to create engine]; nested: TranslogCorruptedException[expected shard UUID [[77 65 57 68 36 68 50 48 52 31 79 45 73 68 48 74 43 4c 58 43 72 67]] but got: [[53 46 5f 7a 48 79 6f 71 54 52 69 43 69 50 72 6c 6f 66 77 34 6d 51]] this translog file belongs to a different translog];
        at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:250)
        at org.elasticsearch.index.shard.StoreRecoveryService.access$100(StoreRecoveryService.java:56)
        at org.elasticsearch.index.shard.StoreRecoveryService$1.run(StoreRecoveryService.java:129)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: [log.direct-engine-server.20171031_68][[log.direct-engine-server.20171031_68][0]] EngineCreationFailureException[failed to create engine]; nested: TranslogCorruptedException[expected shard UUID [[77 65 57 68 36 68 50 48 52 31 79 45 73 68 48 74 43 4c 58 43 72 67]] but got: [[53 46 5f 7a 48 79 6f 71 54 52 69 43 69 50 72 6c 6f 66 77 34 6d 51]] this translog file belongs to a different translog];
        at org.elasticsearch.index.engine.InternalEngine.<init>(InternalEngine.java:155)
        at org.elasticsearch.index.engine.InternalEngineFactory.newReadWriteEngine(InternalEngineFactory.java:25)
        at org.elasticsearch.index.shard.IndexShard.newEngine(IndexShard.java:1515)
        at org.elasticsearch.index.shard.IndexShard.createNewEngine(IndexShard.java:1499)
        at org.elasticsearch.index.shard.IndexShard.internalPerformTranslogRecovery(IndexShard.java:972)
        at org.elasticsearch.index.shard.IndexShard.performTranslogRecovery(IndexShard.java:944)
        at org.elasticsearch.index.shard.StoreRecoveryService.recoverFromStore(StoreRecoveryService.java:241)
        ... 5 more

From the log it looks like the translogs are different, but I migrated the index under the same name from the source cluster to the destination cluster, so they should be the same index. Why can't the shard be assigned, and how can I get it assigned successfully?
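
For reference, the unassigned shards can be listed with something like the following (host and port are placeholders):

# Show which shards are unassigned and why, via the _cat/shards API.
curl -s 'http://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED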

Can anybody help me? Thanks~

Rsyncing won't work; there are a bunch of things outside the index directory that need to be included (e.g. cluster state) to be able to restore the index.

What about doing a remote reindex?
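
Roughly like this, assuming the destination cluster is on 5.x or later and the source host has been added to reindex.remote.whitelist in elasticsearch.yml; the host names and index name below are placeholders:

# Run against the destination cluster; it pulls documents from the source
# cluster over HTTP (requires Elasticsearch 5.x+ and reindex.remote.whitelist).
curl -s -XPOST 'http://dest-node:9200/_reindex' -H 'Content-Type: application/json' -d '
{
  "source": {
    "remote": { "host": "http://source-node:9200" },
    "index": "log.direct-engine-server.20171101_68"
  },
  "dest": { "index": "log.direct-engine-server.20171101_68" }
}'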

Remote reindex seems like a good idea, but my company forbids direct communication between the clusters, so we rsync the index from the source cluster to an intermediate node, and then from the intermediate node to the destination cluster.
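
In other words, two hops, roughly like this (host names and paths are placeholders):

# Hop 1: source data node -> intermediate node.
rsync -avz esuser@source-node:/var/lib/elasticsearch/src-cluster/nodes/0/indices/log.direct-engine-server.20171101_68/ \
  /data/staging/log.direct-engine-server.20171101_68/
# Hop 2: intermediate node -> destination data node.
rsync -avz /data/staging/log.direct-engine-server.20171101_68/ \
  esuser@dest-node:/var/lib/elasticsearch/dst-cluster/nodes/0/indices/log.direct-engine-server.20171101_68/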

As you say, I made sure rsync copied the whole index (including the shards and their _state). I migrated 6 indices; 1 of them could be assigned, but the rest of the indices cannot.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.