I have two nodes that have been out of the cluster for some time, and I have just tried to bring them back in. Both fail to join in the same way: the connection times out.
However, tcpdump running on one of the cluster members shows the two machines communicating. The log on secesprd05 shows:
[2021-08-10T15:53:55,249][INFO ][o.e.c.c.JoinHelper ] [secesprd05] failed to join {secesprd01}{kAWPcpoxSNSN9WlUsYlQlg}{IZs_lY1dStmeuqmhsgWQOQ}{10.6.0.67}{10.6.0.67:9300}{cdhmw}{xpack.installed=true, molochtype=hot, transform.node=false} with JoinRequest{sourceNode={secesprd05}{4cPiEfloRoKgvx-NqVp4aA}{lhmrpJpNQhuS_0hhrGDR5g}{130.216.236.212}{130.216.236.212:9300}{cd}{xpack.installed=true, transform.node=false}, minimumTerm=32, optionalJoin=Optional.empty}
org.elasticsearch.transport.RemoteTransportException: [secesprd01][10.6.0.67:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.transport.ConnectTransportException: [secesprd05][130.216.236.212:9300] connect_exception
at org.elasticsearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:978) ~[elasticsearch-7.10.1.jar:7.10.1]
at org.elasticsearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:198) ~[elasticsearch-7.10.1.jar:7.10.1]
at org.elasticsearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:42) ~[elasticsearch-core-7.10.1.jar:7.10.1]
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2152) ~[?:?]
at org.elasticsearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:57) ~[elasticsearch-core-7.10.1.jar:7.10.1]
at org.elasticsearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:68) ~[transport-netty4-client-7.10.1.jar:7.10.1]
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:263) ~[netty-transport-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.49.Final.jar:4.1.49.Final]
at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: java.io.IOException: connection timed out: 130.216.236.212/130.216.236.212:9300
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:261) ~[netty-transport-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) ~[netty-common-4.1.49.Final.jar:4.1.49.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[?:?]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
at java.lang.Thread.run(Thread.java:832) ~[?:?]
The tcpdump output shows:
15:53:46.258143 IP secesprd05.its.auckland.ac.nz.49094 > secesprd01.its.auckland.ac.nz.9300: Flags [P.], seq 567169938:567170919, ack 1195283306, win 501, options [nop,nop,TS val 2264656474 ecr 2515565298], length 981
0x0000: 4500 0409 b743 4000 3f06 06b6 82d8 ecd4 E....C@.?.......
0x0010: 0a06 0043 bfc6 2454 21ce 5392 473e 936a ...C..$T!.S.G>.j
0x0020: 8018 01f5 4bf7 0000 0101 080a 86fb ea5a ....K..........Z
0x0030: 95f0 7af2 4553 0000 03cf 0000 0000 0000 ..z.ES..........
0x0040: 10ad 0000 6c56 c300 0000 8b01 1e5f 7870 ....lV......._xp
0x0050: 6163 6b5f 7365 6375 7269 7479 5f61 7574 ack_security_aut
0x0060: 6865 6e74 6963 6174 696f 6e40 7736 3278 hentication@w62x
0x0070: 4177 4548 5833 4e35 6333 526c 6251 707a AwEHX3N5c3RlbQpz
0x0080: 5a57 4e6c 6333 4279 5a44 4131 4346 3966 ZWNlc3ByZDA1CF9f
0x0090: 5958 5230 5957 4e6f 4346 3966 5958 5230 YXR0YWNoCF9fYXR0
0x00a0: 5957 4e6f 4141 514b 4141 3d3d 0001 0678 YWNoAAQKAA==...x
0x00b0: 2d70 6163 6b20 696e 7465 726e 616c 3a64 -pack.internal:d
0x00c0: 6973 636f 7665 7279 2f72 6571 7565 7374 iscovery/request
0x00d0: 5f70 6565 7273 000a 7365 6365 7370 7264 _peers..secesprd
0x00e0: 3035 1634 6350 6945 666c 6f52 6f4b 6776 05.4cPiEfloRoKgv
0x00f0: 782d 4e71 5670 3461 4116 6c68 6d72 704a x-NqVp4aA.lhmrpJ
0x0100: 704e 5168 7553 5f30 6868 7247 4452 3567 pNQhuS_0hhrGDR5g
0x0110: 0f31 3330 2e32 3136 2e32 3336 2e32 3132 .130.216.236.212
0x0120: 0f31 3330 2e32 3136 2e32 3336 2e32 3132 .130.216.236.212
0x0130: 0482 d8ec d40f 3133 302e 3231 362e 3233 ......130.216.23
0x0140: 362e 3231 3200 0024 5402 0f78 7061 636b 6.212..$T..xpack
0x0150: 2e69 6e73 7461 6c6c 6564 0474 7275 650e .installed.true.
0x0160: 7472 616e 7366 6f72 6d2e 6e6f 6465 0566 transform.node.f
0x0170: 616c 7365 0204 6461 7461 0164 0109 6461 alse..data.d..da
0x0180: 7461 5f63 6f6c 6401 6301 a7ae b103 030a ta_cold.c.......
0x0190: 7365 6365 7370 7264 3031 166b 4157 5063 secesprd01.kAWPc
0x01a0: 706f 7853 4e53 4e39 576c 5573 596c 516c poxSNSN9WlUsYlQl
0x01b0: 6716 495a 735f 6c59 3164 5374 6d65 7571 g.IZs_lY1dStmeuq
0x01c0: 6d68 7367 5751 4f51 0931 302e 362e 302e mhsgWQOQ.10.6.0.
0x01d0: 3637 0931 302e 362e 302e 3637 040a 0600 67.10.6.0.67....
0x01e0: 4309 3130 2e36 2e30 2e36 3700 0024 5403 C.10.6.0.67..$T.
0x01f0: 0f78 7061 636b 2e69 6e73 7461 6c6c 6564 .xpack.installed
0x0200: 0474 7275 650a 6d6f 6c6f 6368 7479 7065 .true.molochtype
0x0210: 0368 6f74 0e74 7261 6e73 666f 726d 2e6e .hot.transform.n
0x0220: 6f64 6505 6661 6c73 6505 0464 6174 6101 ode.false..data.
0x0230: 6401 0964 6174 615f 636f 6c64 0163 0108 d..data_cold.c..
0x0240: 6461 7461 5f68 6f74 0168 0109 6461 7461 data_hot.h..data
0x0250: 5f77 6172 6d01 7701 066d 6173 7465 7201 _warm.w..master.
0x0260: 6d00 c3ad b103 0b73 6563 6d6f 6e70 7264 m......secmonprd
0x0270: 3037 1654 4e48 6c64 4779 4151 3532 734e 07.TNHldGyAQ52sN
0x0280: 6c49 6247 5062 674d 6716 724c 6359 4172 lIbGPbgMg.rLcYAr
0x0290: 4637 5359 654b 414a 4c31 6a42 6a43 3667 F7SYeKAJL1jBjC6g
0x02a0: 0d31 3330 2e32 3136 2e35 2e31 3131 0d31 .130.216.5.111.1
0x02b0: 3330 2e32 3136 2e35 2e31 3131 0482 d805 30.216.5.111....
0x02c0: 6f0d 3133 302e 3231 362e 352e 3131 3100 o.130.216.5.111.
0x02d0: 0024 5403 0f78 7061 636b 2e69 6e73 7461 .$T..xpack.insta
0x02e0: 6c6c 6564 0474 7275 650a 6d6f 6c6f 6368 lled.true.moloch
0x02f0: 7479 7065 0477 6172 6d0e 7472 616e 7366 type.warm.transf
0x0300: 6f72 6d2e 6e6f 6465 0566 616c 7365 0304 orm.node.false..
0x0310: 6461 7461 0164 0109 6461 7461 5f77 6172 data.d..data_war
0x0320: 6d01 7701 066d 6173 7465 7201 6d00 c3ad m.w..master.m...
0x0330: b103 0a73 6563 6573 7072 6430 3216 3655 ...secesprd02.6U
0x0340: 4461 674a 5732 5433 6557 4d2d 3050 514a DagJW2T3eWM-0PQJ
0x0350: 3072 4d41 1677 6230 6a6b 7a44 6b51 364b 0rMA.wb0jkzDkQ6K
0x0360: 3536 4f4e 5777 6b58 336b 4109 3130 2e36 56ONWwkX3kA.10.6
0x0370: 2e30 2e36 3809 3130 2e36 2e30 2e36 3804 .0.68.10.6.0.68.
0x0380: 0a06 0044 0931 302e 362e 302e 3638 0000 ...D.10.6.0.68..
0x0390: 2454 030f 7870 6163 6b2e 696e 7374 616c $T..xpack.instal
0x03a0: 6c65 6404 7472 7565 0a6d 6f6c 6f63 6874 led.true.molocht
0x03b0: 7970 6503 686f 740e 7472 616e 7366 6f72 ype.hot.transfor
0x03c0: 6d2e 6e6f 6465 0566 616c 7365 0504 6461 m.node.false..da
0x03d0: 7461 0164 0109 6461 7461 5f63 6f6c 6401 ta.d..data_cold.
0x03e0: 6301 0864 6174 615f 686f 7401 6801 0964 c..data_hot.h..d
0x03f0: 6174 615f 7761 726d 0177 0106 6d61 7374 ata_warm.w..mast
0x0400: 6572 016d 00c3 adb1 03 er.m.....
15:53:46.258352 IP secesprd01.its.auckland.ac.nz.9300 > secesprd05.its.auckland.ac.nz.49094: Flags [P.], seq 1:347, ack 981, win 501, options [nop,nop,TS val 2515567299 ecr 2264656474], length 346
0x0000: 4500 018e b83e 4000 4006 0736 0a06 0043 E....>@.@..6...C
0x0010: 82d8 ecd4 2454 bfc6 473e 936a 21ce 5767 ....$T..G>.j!.Wg
0x0020: 8018 01f5 7b76 0000 0101 080a 95f0 82c3 ....{v..........
0x0030: 86fb ea5a 4553 0000 0154 0000 0000 0000 ...ZES...T......
0x0040: 10ad 0100 6c56 c300 0000 6201 1e5f 7870 ....lV....b.._xp
0x0050: 6163 6b5f 7365 6375 7269 7479 5f61 7574 ack_security_aut
0x0060: 6865 6e74 6963 6174 696f 6e40 7736 3278 hentication@w62x
0x0070: 4177 4548 5833 4e35 6333 526c 6251 707a AwEHX3N5c3RlbQpz
0x0080: 5a57 4e6c 6333 4279 5a44 4131 4346 3966 ZWNlc3ByZDA1CF9f
0x0090: 5958 5230 5957 4e6f 4346 3966 5958 5230 YXR0YWNoCF9fYXR0
0x00a0: 5957 4e6f 4141 514b 4141 3d3d 0001 0a73 YWNoAAQKAA==...s
0x00b0: 6563 6573 7072 6430 3116 6b41 5750 6370 ecesprd01.kAWPcp
0x00c0: 6f78 534e 534e 3957 6c55 7359 6c51 6c67 oxSNSN9WlUsYlQlg
0x00d0: 1649 5a73 5f6c 5931 6453 746d 6575 716d .IZs_lY1dStmeuqm
0x00e0: 6873 6757 514f 5109 3130 2e36 2e30 2e36 hsgWQOQ.10.6.0.6
0x00f0: 3709 3130 2e36 2e30 2e36 3704 0a06 0043 7.10.6.0.67....C
0x0100: 0931 302e 362e 302e 3637 0000 2454 030f .10.6.0.67..$T..
0x0110: 7870 6163 6b2e 696e 7374 616c 6c65 6404 xpack.installed.
0x0120: 7472 7565 0a6d 6f6c 6f63 6874 7970 6503 true.molochtype.
0x0130: 686f 740e 7472 616e 7366 6f72 6d2e 6e6f hot.transform.no
0x0140: 6465 0566 616c 7365 0504 6461 7461 0164 de.false..data.d
0x0150: 0109 6461 7461 5f63 6f6c 6401 6301 0864 ..data_cold.c..d
0x0160: 6174 615f 686f 7401 6801 0964 6174 615f ata_hot.h..data_
0x0170: 7761 726d 0177 0106 6d61 7374 6572 016d warm.w..master.m
0x0180: 00c3 adb1 0300 0000 0000 0000 0020 ..............
15:53:46.258485 IP secesprd05.its.auckland.ac.nz.49094 > secesprd01.its.auckland.ac.nz.9300: Flags [.], ack 347, win 501, options [nop,nop,TS val 2264656474 ecr 2515567299], length 0
0x0000: 4500 0034 b744 4000 3f06 0a8a 82d8 ecd4 E..4.D@.?.......
0x0010: 0a06 0043 bfc6 2454 21ce 5767 473e 94c4 ...C..$T!.WgG>..
0x0020: 8010 01f5 3775 0000 0101 080a 86fb ea5a ....7u.........Z
0x0030: 95f0 82c3 ....
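One thing that can be checked from this: the capture above appears to show only the connection that secesprd05 opened to secesprd01 on port 9300 (from source port 49094), whereas the exception reports secesprd01 timing out while connecting to 130.216.236.212:9300. A minimal Python sketch for testing that reverse direction, meant to be run from secesprd01. The helper name `can_connect` and the 5-second timeout are assumptions made for illustration, not anything provided by Elasticsearch:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection performs the full TCP handshake or raises OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return False
```

Calling `can_connect("130.216.236.212", 9300)` on secesprd01 and getting `False` would be consistent with the timeout in the log, and would suggest looking at firewall or routing rules for inbound traffic to port 9300 on secesprd05. It only tests plain TCP reachability, not the Elasticsearch transport handshake.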
Any ideas as to what is going on?