We are running two instances of Elasticsearch, version 6.8.1. The first instance is in our production environment and the second is in our development environment. We're attempting to replicate data from production to development so we can use it to test an application that will be querying Elasticsearch.
We've opened port 9300 between the two environments, and the Kibana GUI shows the production cluster as connected under Remote Clusters. However, when I create a follower index to test replication from production to development, the shards on the development cluster fail to allocate with a connect_timeout error:
[2020-04-13T15:56:47,245][WARN ][o.e.c.r.a.AllocationService] [devMaster] failing shard [failed shard, shard [builds-20200410][3], node[fwXUzBLmQDWEAjTLPJCVCw], [P], recovery_source[snapshot recovery [GlgFqTxrRIGxpWIldwRPdg] from _ccr_production:_latest_/_latest_], s[INITIALIZING], a[id=3FBNINKYSk2eXzVb-dw5ww], unassigned_info[[reason=ALLOCATION_FAILED], at[2020-04-13T15:55:16.414Z], failed_attempts[4], failed_nodes[[jItPymr0QzifCI9Km3UkOg, Zpl7miC4T_SFsm2RRBUi1A, bxgXoKIBTpKYyEADXfCFTg, fwXUzBLmQDWEAjTLPJCVCw]], delayed=false, details[failed shard on node [Zpl7miC4T_SFsm2RRBUi1A]: failed recovery, failure RecoveryFailedException[[builds-20200410][3]: Recovery failed on {DevMaster2}{Zpl7miC4T_SFsm2RRBUi1A}{xE00h9O8Sbq7PMuUK1rjqw}{DevMaster2}{IP:9300}{dilm}{ml.machine_memory=135020195840, xpack.installed=true, ml.max_open_jobs=20}]; nested: IndexShardRecoveryException[failed recovery]; nested: IndexShardRestoreFailedException[restore failed]; nested: ConnectTransportException[[][IP:9300] connect_timeout[30s]]; ], allocation_status[fetching_shard_data]], expected_shard_size[0], message [failed recovery], failure [RecoveryFailedException[[builds-20200410][3]: Recovery failed on {DevMaster3}{fwXUzBLmQDWEAjTLPJCVCw}{EJvLsrXsTX2pdZ2SvJsYFQ}{DevMaster3}{IP:9300}{dilm}{ml.machine_memory=67378692096, xpack.installed=true, ml.max_open_jobs=20}]; nested: IndexShardRecoveryException[failed recovery]; nested: IndexShardRestoreFailedException[restore failed]; nested: ConnectTransportException[[][IP:9300] connect_timeout[30s]]; ], markAsStale [true]]
org.elasticsearch.indices.recovery.RecoveryFailedException: [builds-20200410][3]: Recovery failed on {DevMaster3}{fwXUzBLmQDWEAjTLPJCVCw}{EJvLsrXsTX2pdZ2SvJsYFQ}{DevMaster3}{IP:9300}{dilm}{ml.machine_memory=67378692096, xpack.installed=true, ml.max_open_jobs=20}
at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$17(IndexShard.java:2584) ~[elasticsearch-7.5.2.jar:7.5.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) ~[elasticsearch-7.5.2.jar:7.5.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]
Caused by: org.elasticsearch.index.shard.IndexShardRecoveryException: failed recovery
at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:353) ~[elasticsearch-7.5.2.jar:7.5.2]
at org.elasticsearch.index.shard.StoreRecovery.recoverFromRepository(StoreRecovery.java:283) ~[elasticsearch-7.5.2.jar:7.5.2]
at org.elasticsearch.index.shard.IndexShard.restoreFromRepository(IndexShard.java:1867) ~[elasticsearch-7.5.2.jar:7.5.2]
at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$17(IndexShard.java:2580) ~[elasticsearch-7.5.2.jar:7.5.2]
... 4 more
Caused by: org.elasticsearch.index.snapshots.IndexShardRestoreFailedException: restore failed
at org.elasticsearch.index.shard.StoreRecovery.restore(StoreRecovery.java:480) ~[elasticsearch-7.5.2.jar:7.5.2]
at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromRepository$5(StoreRecovery.java:285) ~[elasticsearch-7.5.2.jar:7.5.2]
at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:308) ~[elasticsearch-7.5.2.jar:7.5.2]
at org.elasticsearch.index.shard.StoreRecovery.recoverFromRepository(StoreRecovery.java:283) ~[elasticsearch-7.5.2.jar:7.5.2]
at org.elasticsearch.index.shard.IndexShard.restoreFromRepository(IndexShard.java:1867) ~[elasticsearch-7.5.2.jar:7.5.2]
at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$17(IndexShard.java:2580) ~[elasticsearch-7.5.2.jar:7.5.2]
... 4 more
Caused by: org.elasticsearch.transport.ConnectTransportException: [][IP:9300] connect_timeout[30s]
at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:995) ~[elasticsearch-7.5.2.jar:7.5.2]
... 4 more
[2020-04-13T15:56:48,180][DEBUG][o.e.a.a.c.s.r.RestoreClusterStateListener] [DevMaster] restore of [_latest_/_latest_] completed
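The log above reports the timeout during recovery on DevMaster2 and DevMaster3, so it may be worth confirming that every development node, not only the one that shows the remote cluster as connected, can open a TCP connection to the production transport port. Below is a minimal sketch of such a check, meant to be run from each development node in turn. The host name is a placeholder (an assumption, not taken from our setup), and the helper names `can_connect` and `check_hosts` are made up for illustration. It only tests whether the port accepts a TCP connection; it does not speak the Elasticsearch transport protocol.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


def check_hosts(hosts, port: int = 9300, timeout: float = 5.0) -> dict:
    """Map each host to whether its transport port is reachable from this machine."""
    return {host: can_connect(host, port, timeout) for host in hosts}


if __name__ == "__main__":
    # Hypothetical placeholder: replace with the real production node addresses.
    production_hosts = ["prod-node-1.invalid"]
    for host, reachable in check_hosts(production_hosts).items():
        status = "reachable" if reachable else "NOT reachable"
        print(f"{host}:9300 is {status} from this node")
```

If the check fails on some development nodes but not others, that would point at a firewall or routing rule that only covers part of the development cluster.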