Today, I tried to upgrade my 5-node cluster from 6.6.1 to 6.8.
I planned to apply the rolling upgrade procedure I found here:
https://www.elastic.co/guide/en/elasticsearch/reference/6.8/rolling-upgrades.html
I then disabled shard allocation and performed a synced flush.
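For reference, those two preparation steps can be sketched in Python roughly as follows. This is only a minimal sketch, not necessarily the exact commands I ran: the base URL and the helper names are hypothetical, and the allocation value "primaries" is the one I recall from the 6.8 guide ("none" is also a valid value). A cluster with security enabled would additionally need credentials on each request.

```python
import json
import urllib.request

# Assumption: address of any node in the cluster; adjust to your setup.
ES_URL = "http://localhost:9200"


def build_disable_allocation_request(base_url=ES_URL):
    """Build the PUT _cluster/settings request that restricts shard allocation."""
    body = {"persistent": {"cluster.routing.allocation.enable": "primaries"}}
    return urllib.request.Request(
        url=base_url + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_synced_flush_request(base_url=ES_URL):
    """Build the POST _flush/synced request."""
    return urllib.request.Request(url=base_url + "/_flush/synced", method="POST")


def send(request):
    """Send a prepared request and decode the JSON response (needs a live cluster)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a reachable cluster, calling `send(build_disable_allocation_request())` followed by `send(build_synced_flush_request())` would perform both steps in order.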
But after stopping the first node, my cluster's status turned RED instead of yellow.
To avoid any problems, I restarted the stopped node.
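Since a RED status means at least one primary shard is unassigned, one way to narrow this down is to filter the output of `GET _cat/shards?format=json` for unassigned primaries. The sketch below assumes that output has already been fetched and parsed into a list of dicts; the function name is hypothetical and the sample rows are made up for illustration, they are not taken from my cluster.

```python
def unassigned_primaries(cat_shards_rows):
    """Return (index, shard) pairs for primary shards in state UNASSIGNED.

    Expects rows shaped like the JSON output of GET _cat/shards?format=json,
    where "prirep" is "p" for a primary and "r" for a replica.
    """
    return [
        (row["index"], row["shard"])
        for row in cat_shards_rows
        if row["prirep"] == "p" and row["state"] == "UNASSIGNED"
    ]


# Made-up sample rows, for illustration only.
sample_rows = [
    {"index": "example-index", "shard": "0", "prirep": "p", "state": "STARTED"},
    {"index": "example-index", "shard": "0", "prirep": "r", "state": "UNASSIGNED"},
    {"index": "example-index", "shard": "1", "prirep": "p", "state": "UNASSIGNED"},
]

red_shards = unassigned_primaries(sample_rows)
```

Any shard this returns could then be passed to the cluster allocation explain API (`GET _cluster/allocation/explain`) to see why it is not assigned.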
After checking the logs on all my nodes, I found nothing unusual except a strange error message on one of them.
Right after the log line reporting that the primary-replica resync of an index shard had completed, there is a message saying "global checkpoint sync failed". See below.
Any idea what happened?
Regards,
Jean-Marc
=====
[2019-11-07T09:13:36,774][INFO ][o.e.i.s.IndexShard ] [el8023.bc] [workflow-job-2019.10.08][3] primary-replica resync completed with 0 operations
[...]
[2019-11-07T09:13:36,783][INFO ][o.e.i.s.GlobalCheckpointSyncAction] [el8023.bc] [workflow-job-2019.10.08][3] global checkpoint sync failed
org.elasticsearch.transport.RemoteTransportException: [el8024.bc][10.120.120.37:9300][indices:admin/seq_no/global_checkpoint_sync]
Caused by: org.elasticsearch.transport.SendRequestTransportException: [el8024.bc][10.120.120.37:9300][indices:admin/seq_no/global_checkpoint_sync[p]]
    at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:639) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$1.sendRequest(SecurityServerTransportInterceptor.java:136) ~[?:?]
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:542) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:530) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.performAction(TransportReplicationAction.java:873) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.performLocalAction(TransportReplicationAction.java:824) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:811) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.action.support.replication.TransportReplicationAction.doExecute(TransportReplicationAction.java:172) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.action.support.replication.TransportReplicationAction.doExecute(TransportReplicationAction.java:100) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:124) ~[?:?]
    at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165) ~[elasticsearch-6.6.0.jar:6.6.0]
    at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.6.0.jar:6.6.0]