Elasticsearch master election is taking too long (30+ minutes)

Hi,

We have a custom ELK stack, version 7.1.1, deployed on IKS (IBM Cloud Kubernetes Service) with Helm charts. The setup is made up of 3x elasticsearch-master nodes, 2x elasticsearch-data nodes, 1x Logstash pod for gathering logs, and 1x Kibana pod for viewing them. The big issue is that whenever we redeploy the whole Helm chart it takes 30+ minutes for the master election process to finish.

We activated TRACE logging on the master nodes:

logger.org.elasticsearch.transport: trace
logger.org.elasticsearch.gateway.MetaStateService: trace

We see a lot of different log entries; below are some that we thought were relevant.

// {"type": "server", "timestamp": "2019-10-18T13:02:13,674+0000", "level": "TRACE", "component": "o.e.t.T.tracer", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "message": "[51][internal:cluster/coordination/join] received response from [{elasticsearch-master-1}{mOXnX6QwQDOeRDqsf4b3dg}{-k6TYqt-Sz2Vancs3JucbA}{172.30.188.87}{172.30.188.87:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}]" }
{"type": "server", "timestamp": "2019-10-18T13:02:13,675+0000", "level": "INFO", "component": "o.e.c.c.JoinHelper", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "message": "failed to join {elasticsearch-master-1}{mOXnX6QwQDOeRDqsf4b3dg}{-k6TYqt-Sz2Vancs3JucbA}{172.30.188.87}{172.30.188.87:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20} with JoinRequest{sourceNode={elasticsearch-master-1}{mOXnX6QwQDOeRDqsf4b3dg}{-k6TYqt-Sz2Vancs3JucbA}{172.30.188.87}{172.30.188.87:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}, optionalJoin=Optional[Join{term=814, lastAcceptedTerm=809, lastAcceptedVersion=513638, sourceNode={elasticsearch-master-1}{mOXnX6QwQDOeRDqsf4b3dg}{-k6TYqt-Sz2Vancs3JucbA}{172.30.188.87}{172.30.188.87:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}, targetNode={elasticsearch-master-1}{mOXnX6QwQDOeRDqsf4b3dg}{-k6TYqt-Sz2Vancs3JucbA}{172.30.188.87}{172.30.188.87:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}}]}" ,
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [elasticsearch-master-1][172.30.188.87:9300][internal:cluster/coordination/join]",
"Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: received a newer join from {elasticsearch-master-1}{mOXnX6QwQDOeRDqsf4b3dg}{-k6TYqt-Sz2Vancs3JucbA}{172.30.188.87}{172.30.188.87:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}",
"at org.elasticsearch.cluster.coordination.JoinHelper$CandidateJoinAccumulator.handleJoinRequest(JoinHelper.java:451) [elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.cluster.coordination.Coordinator.processJoinRequest(Coordinator.java:512) [elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.cluster.coordination.Coordinator.handleJoinRequest(Coordinator.java:478) [elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$0(JoinHelper.java:124) [elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:251) [x-pack-security-7.1.1.jar:7.1.1]",
//

//"stacktrace": ["org.elasticsearch.cluster.coordination.FailedToCommitClusterStateException: node is no longer master for term 817 while handling publication",
"at org.elasticsearch.cluster.coordination.Coordinator.publish(Coordinator.java:1013) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.cluster.service.MasterService.publish(MasterService.java:252) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:238) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142) org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215) ~[elasticsearch-7.1.1.jar:7.1.1]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]",
"at java.lang.Thread.run(Thread.java:835) [?:?]"] }
{"type": "server", "timestamp": "2019-10-18T13:02:17,235+0000", "level": "TRACE", "component": "o.e.t.TransportLogger", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "message": "Netty4TcpChannel{localAddress=/172.30.188.87:48396, remoteAddress=elasticsearch-master-headless/172.30.66.112:9300} [length: 58, request id: 94, type: request, version: 6.8.0, action: internal:tcp/handshake] WRITE: 58B" }
{"type": "server", "timestamp": "2019-10-18T13:02:17,235+0000", "level": "TRACE", "component": "o.e.t.n.ESLoggingHandler", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "message": "[id: 0xe6aed23d, L:/172.30.188.87:48396 - R:elasticsearch-master-headless/172.30.66.112:9300] WRITE: 58B\n +-------------------------------------------------+\n | 0 1 2 3 4 5 6 7 8 9 a b c d e f |\n+--------+-------------------------------------------------+----------------+\n|00000000| 45 53 00 00 00 34 00 00 00 00 00 00 00 5e 08 00 |ES...4.......^..|\n|00000010| 5c c6 63 00 00 01 06 78 2d 70 61 63 6b 16 69 6e |\.c....x-pack.in|\n|00000020| 74 65 72 6e 61 6c 3a 74 63 70 2f 68 61 6e 64 73 |ternal:tcp/hands|\n|00000030| 68 61 6b 65 00 04 97 ef ab 03 |hake...... |\n+--------+-------------------------------------------------+----------------+" }
{"type": "server", "timestamp": "2019-10-18T13:02:17,236+0000", "level": "TRACE", "component": "o.e.t.n.ESLoggingHandler", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "message": "[id: 0xe6aed23d, L:/172.30.188.87:48396 - R:elasticsearch-master-headless/172.30.66.112:9300] FLUSH" }
{"type": "server", "timestamp": "2019-10-18T13:02:17,244+0000", "level": "TRACE", "component": "o.e.t.n.ESLoggingHandler", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "message": "[id: 0xe6aed23d, L:/172.30.188.87:48396 - R:elasticsearch-master-headless/172.30.66.112:9300] ACTIVE" }
{"type": "server", "timestamp": "2019-10-18T13:02:17,211+0000", "level": "TRACE", "component": "o.e.t.T.tracer", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "message": "[73][internal:cluster/coordination/join] sent error response" ,
"stacktrace": ["org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 817 does not match current term 818",
//

We can upload the full logs; please point us to a secure way to do that, preferably not a public one.

Thanks in advance,
Gergely Zoltan

Hi @zgergely, I'd be interested to see the logs. You can upload them privately here:

https://upload-staging.elastic.dev/u/c05f898d-0c29-4c0e-8fb5-4bac6c0019e9

I would also like you to add these trace loggers:

logger.org.elasticsearch.cluster.coordination: TRACE
logger.org.elasticsearch.discovery: TRACE
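
If it's easier, these can usually go in via whatever mechanism your chart uses to populate elasticsearch.yml. A minimal sketch, assuming an esConfig-style value like the one in the official elastic/helm-charts elasticsearch chart (the release and chart names below are placeholders for whatever you actually use):

cat > values-trace-logging.yaml <<'EOF'
# rendered into config/elasticsearch.yml on the master pods
esConfig:
  elasticsearch.yml: |
    logger.org.elasticsearch.cluster.coordination: TRACE
    logger.org.elasticsearch.discovery: TRACE
EOF
helm upgrade <your-release> <your-chart> -f values-trace-logging.yaml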

(edit: not sure whether to address you as Zoltan or Gergely, they're both given names, so I changed it to your username instead)

Hi @DavidTurner

Thanks for the quick reply :)

Activated the TRACE loggers you mentioned and re-deployed the cluster (it took 50 minutes this time for the election to complete...).

Uploaded the logs:

'elastic-master-2.log' https://upload-staging.elastic.dev/d/38ba3075e20c830392bba254b2e5a3361857d146b780a4f0f098770060bd305c

'elastic-master-1.log' https://upload-staging.elastic.dev/d/ce0ba7142ef9aec2e95e4cfb0c9d78fa597d397b78e8f69f8c24afc63a6db4a4

'elastic-master-0.log' https://upload-staging.elastic.dev/d/3706dc3e17dc57422660ff4bb5691c9f408c45d1a8663141d4b368608edc50a9

Looking forward to your analysis.

PS: Zoltan or @zgergely is fine anyway ;))

Thank you,
Zoltan

It looks like you disabled the logger.org.elasticsearch.gateway.MetaStateService logger from your original config? Picking an election essentially at random, I see this:

{
  "type": "server",
  "timestamp": "2019-10-21T14:41:09,739+0000",
  "level": "TRACE",
  "component": "o.e.c.c.CoordinationState",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "message": "handlePublishRequest: accepting publish request for version [513965] and term [902]"
}
{
  "type": "server",
  "timestamp": "2019-10-21T14:45:09,618+0000",
  "level": "TRACE",
  "component": "o.e.c.c.PreVoteCollector",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "message": "updating with preVoteResponse=PreVoteResponse{currentTerm=902, lastAcceptedTerm=902, lastAcceptedVersion=513965}, leader={elasticsearch-master-1}{AfaPrC0YSVSTPsN7CM5pOQ}{1lBni3SuRK2wHASbXGTZGQ}{172.30.188.100}{172.30.188.100:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}"
}

The only thing that happens between these two log lines is that the cluster state is written to disk. This is taking about 4 minutes to complete, which is much, much longer than expected. Setting logger.org.elasticsearch.gateway.MetaStateService: TRACE will tell us exactly what it's doing in that time.
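
Once that logger is on, something like this over the next batch of logs should show how long each per-index metadata write takes (the file name is just an example; the messages to look for are the 'writing state' / 'state written' pairs):

$ grep 'MetaStateService' elastic-master-0.log | grep -e 'writing state' -e 'state written'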

Hi @DavidTurner

Thanks for the fast assessment. We use FileStorage (NFS) for the persistent volumes of the whole ELK deployment. Maybe I could boost the IOPS to see what happens.

Nevertheless, I have also activated the TRACE logger you just mentioned. (It took 50 minutes to re-deploy again.)

Uploaded the logs:

'elastic-master-0_take2.log' https://upload-staging.elastic.dev/d/b6117a1000b3e430e58418b4c9c9537cb06b0d1e5723e694630cceeab378d516

'elastic-master-0_take3.log' https://upload-staging.elastic.dev/d/f9fa396b391bd8eadf61bd58deaf0ecd7818a4dd74bd922db12f0b10ecefbe5e

'elastic-master-1_take2.log' https://upload-staging.elastic.dev/d/1713e1a6c3b461c5b723f2d05b82875e48dddbd1dd1a618ce0020775528149e0

'elastic-master-1_take3.log' https://upload-staging.elastic.dev/d/aefb97cd80e54a69bf97e72321533246aee270dba09acd20cf09b8712d7df662

'elastic-master-2_take2.log' https://upload-staging.elastic.dev/d/39d154759958b4e2913102a904a76756d0a80911699bc54941ace9c84cb12d84

'elastic-master-2_take3.log' https://upload-staging.elastic.dev/d/41a4230057f073093d5f92f053cb187e5ed63a1d3bbae40c0f9c37d65d6a445c

*The logs are split into two parts for each node because I got an error from the K8s cluster: <error: http2: server sent GOAWAY and closed the connection; LastStreamID=3, ErrCode=NO_ERROR, debug="">. Not sure why; anyway, log collection resumed afterwards.

**I accidentally overwrote 'elastic-master-2_take2.log', so those logs are not from the beginning; if needed, let me know and I will redo the whole process.

Br,
Zoltan

Hi @DavidTurner

I have maxed out the IOPS setting for the FileStorage used for the elastic-master persistent volume claims, from 2 IOPS/GB (the initial level) to 10 IOPS/GB, and re-deployed the whole ELK Helm chart from scratch. It took 14 minutes, which is definitely an improvement, but I guess this is still not quite OK.

How long should a well-functioning elastic-master election on K8s take?

Also, could we get rid of these persistent volume claims (FileStorage) and have the cluster state written only inside the pod (K8s worker RAM)?

Uploaded the logs for this take also:

'elastic-master-0_take4.log' https://upload-staging.elastic.dev/d/c0e02edeb8a6def5167bddab0d96cbd3a4ece06e1915e2cafb805741fb49f8bf

'elastic-master-1_take4.log' https://upload-staging.elastic.dev/d/02b68a4e09b16bebc72f98fab5fab5aa612a22dfc2d08dee981132ba98b6eed5

'elastic-master-2_take4.log' https://upload-staging.elastic.dev/d/3c68fac54ca982bf470633576f37c742feaf7573a8f2d076d21daac11b4ebe13

Thanks in advance,
Zoltan

That sounds very low, especially if you intend to index data into the cluster, which can be quite I/O intensive.

All data and master nodes require persistent storage.

Got them. Don't worry, there's no concept of "overwriting" in this upload service.

Your IO performance is dreadful:

{
  "type": "server",
  "timestamp": "2019-10-21T19:07:00,538+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.09.11/JkGSQ4cMSRmPi_p8plizGQ]] writing state, reason [freshly created]"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,094+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.09.11/JkGSQ4cMSRmPi_p8plizGQ]] state written"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,094+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.10.13/Uhq5bLsVR0C4UXRd5l4lZQ]] writing state, reason [freshly created]"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,467+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.10.13/Uhq5bLsVR0C4UXRd5l4lZQ]] state written"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,467+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.08.01/I3ZdbMx4SbKO_BnxAFqyYA]] writing state, reason [freshly created]"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,772+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.08.01/I3ZdbMx4SbKO_BnxAFqyYA]] state written"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,772+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.08.10/76TddLGuTHKpwx1TMggU8w]] writing state, reason [freshly created]"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:02,138+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.08.10/76TddLGuTHKpwx1TMggU8w]] state written"
}

That's the log for writing the metadata for just 4 indices, which took from 19:07:00,538 to 19:07:02,138, i.e. 1600ms in total, or an average of 400ms each. These metadata files are typically very small, a few kB at most, so they should take nowhere near that long to write. Even a slow spinning disk should be able to durably write a small file in a few tens of milliseconds, and an SSD can normally easily beat 10ms. I think you need to work on improving this.
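
If you want to sanity-check the raw durable-write latency yourself, here is a rough sketch run from inside one of the master pods; the container name and data path assume the defaults of the official image, so adjust them to your chart:

# oflag=dsync syncs every 8 kB block before the next write, so the rate dd reports
# reflects per-write flush latency (e.g. ~800 kB/s corresponds to ~10 ms per write)
$ kubectl exec elasticsearch-master-0 -c elasticsearch -- dd if=/dev/zero of=/usr/share/elasticsearch/data/latency-test bs=8k count=100 oflag=dsync
$ kubectl exec elasticsearch-master-0 -c elasticsearch -- rm /usr/share/elasticsearch/data/latency-test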

It might not be obvious why a relatively small difference in IO performance results in master elections taking the best part of an hour rather than a few hundred milliseconds. The trouble is that each election must write out all the index metadata, and concurrent elections typically fail. It normally takes hardly any time to write out all the index metadata and complete the election so the chance of two elections being concurrent is very low, but if it is taking many minutes to complete the election then they will almost always be concurrent. Elasticsearch keeps on trying until eventually one gets lucky and succeeds.
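
You can see that churn in the logs you uploaded, for example by counting how often joins were rejected:

$ grep -c 'failed to join' elastic-master-*.log
$ grep -c 'CoordinationStateRejectedException' elastic-master-*.log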

Hi @DavidTurner,

Provisioned and configured 3x new PVs for the 3 elastic-master nodes, of type Block Storage, 20Gi and 1000 IOPS (meaning a total of 20,000 IOPS for each of them, if I am right).

Redeployed the whole ELK stack. It worked on the first attempt; the master election was done after 99s:

elasticsearch-master-1   2/2   Running   0     99s   172.30.188.83    <none>   <none>
elasticsearch-master-2   2/2   Running   0     103s   172.30.252.230    <none>   <none>
elasticsearch-master-1   1/2   Running   0     2m9s   172.30.188.83    <none>   <none>
elasticsearch-master-2   1/2   Running   0     2m13s   172.30.252.230   <none>   <none>

And after a couple of seconds it broke, as you can see above (the pods dropped back to 1/2 ready):

{"type": "server", "timestamp": "2019-10-22T11:10:14,588+0000", "level": "INFO", "component": "o.e.c.s.MasterService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "elected-as-master ([2] nodes joined)[{elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{4AmiGR6wSvSA5HFDdbSU5Q}{172.30.188.83}{172.30.188.83:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20} elect leader, {elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 6, version: 1, reason: master node changed {previous [], current [{elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{4AmiGR6wSvSA5HFDdbSU5Q}{172.30.188.83}{172.30.188.83:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}]}, added {{elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true},}"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,592+0000", "level": "TRACE", "component": "o.e.c.c.CoordinationState", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "handleClientValue: processing request for version [1] and term [6]"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,606+0000", "level": "TRACE", "component": "o.e.c.c.LeaderChecker", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "setCurrentNodes: nodes: \n   {elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}\n   {elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{4AmiGR6wSvSA5HFDdbSU5Q}{172.30.188.83}{172.30.188.83:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}, local, master\n"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,607+0000", "level": "TRACE", "component": "o.e.c.c.FollowersChecker", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "handleWakeUp: checking {elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true} with FollowerCheckRequest{term=6, sender={elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{4AmiGR6wSvSA5HFDdbSU5Q}{172.30.188.83}{172.30.188.83:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}}"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,628+0000", "level": "TRACE", "component": "o.e.c.c.LeaderChecker", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "handling LeaderCheckRequest{sender={elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}}"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,628+0000", "level": "TRACE", "component": "o.e.c.c.FollowersChecker", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "FollowerChecker{discoveryNode={elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} check successful"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,627+0000", "level": "TRACE", "component": "o.e.c.c.C.CoordinatorPublication", "cluster.name": "elasticsearch", "node.name": "elasticsearch-

Above is a piece of the log from the successful election.

I then did a separate attempt to re-deploy, and now it is not deploying at all.

Uploaded the logs for this take also:

'elastic-master-0_take5.log' https://upload-staging.elastic.dev/d/fedcf6ce3afae39b02eb6c7ff94d76433a2ca53027252f85378ac92796cde9af

'elastic-master-1_take5.log' https://upload-staging.elastic.dev/d/ec5b2ac20ad4d076a1067f9de51219d47c2a383608bf0c46fa687a6c6ec76c03

'elastic-master-2_take5.log' https://upload-staging.elastic.dev/d/bc44d5a381de0b2957fd68aea00a96f3464419d44c6f877afbad229cd0ff1595

There is a new error repeating; for example:

{"type": "server", "timestamp": "2019-10-22T11:29:39,324+0000", "level": "TRACE", "component": "o.e.c.c.Coordinator", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "cluster.uuid": "saNE1pJcSQyuHAg7huMu7A", "node.id": "mNDX_1IKR0SW3knwJJHyCA",  "message": "handleJoinRequest: as LEADER, handling JoinRequest{sourceNode={elasticsearch-data-1}{W3cGpJFxTe-b3KcD_Lw85Q}{KWbSRxscTOidCb94QYpbug}{172.30.188.109}{172.30.188.109:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}, optionalJoin=Optional.empty}"  }
{"type": "server", "timestamp": "2019-10-22T11:29:39,374+0000", "level": "WARN", "component": "o.e.c.c.Coordinator", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "cluster.uuid": "saNE1pJcSQyuHAg7huMu7A", "node.id": "mNDX_1IKR0SW3knwJJHyCA",  "message": "failed to validate incoming join request from node [{elasticsearch-data-1}{W3cGpJFxTe-b3KcD_Lw85Q}{KWbSRxscTOidCb94QYpbug}{172.30.188.109}{172.30.188.109:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}]" ,
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [elasticsearch-data-1][172.30.188.109:9300][internal:cluster/coordination/join/validate]",
"Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid saNE1pJcSQyuHAg7huMu7A than local cluster uuid 2KX45kJLQUm3U9P4kfeZgg, rejecting",

PS: Please be aware that the new storage was blank; maybe some steps had to be done before we deployed and we skipped them. Just an idea...

Thanks in advance,
Zoltan

The cluster UUID is set (randomly) the first time the cluster forms, and is stored on disk so it persists across restarts. This message tells us that there are (or have been) two clusters with two different IDs. Make sure you really did start with blank storage, and also make sure you are following the bootstrapping instructions, particularly this:

WARNING: You must set cluster.initial_master_nodes to the same list of nodes on each node on which it is set in order to be sure that only a single cluster forms during bootstrapping and therefore to avoid the risk of data loss.
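
A quick way to check that every master pod really received the same bootstrap configuration, assuming your chart passes these settings in as environment variables (adjust if it templates them into elasticsearch.yml instead):

# each pod should print identical values for these settings
for pod in elasticsearch-master-0 elasticsearch-master-1 elasticsearch-master-2; do
  echo "== $pod"
  kubectl exec "$pod" -- env | grep -e 'initial_master_nodes' -e 'seed_hosts'
done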

The initial discovery and election process took about 4 seconds out of that time:

$ grep elastic-master-1_take5.log -e 'starting \.\.\.\|master node changed'
{"type": "server", "timestamp": "2019-10-22T11:40:59,502+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "starting ..."  }
{"type": "server", "timestamp": "2019-10-22T11:41:03,172+0000", "level": "INFO", "component": "o.e.c.s.MasterService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "elected-as-master ([2] nodes joined)[{elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{fs5oHewwTTKOWchdZ2h8qQ}{172.30.252.247}{172.30.252.247:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true} elect leader, {elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{zu2ROiucQJ2NXMzsq-wjQQ}{172.30.188.107}{172.30.188.107:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 10, version: 19, reason: master node changed {previous [], current [{elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{zu2ROiucQJ2NXMzsq-wjQQ}{172.30.188.107}{172.30.188.107:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}]}, added {{elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{fs5oHewwTTKOWchdZ2h8qQ}{172.30.252.247}{172.30.252.247:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true},}"  }
{"type": "server", "timestamp": "2019-10-22T11:41:03,726+0000", "level": "INFO", "component": "o.e.c.s.ClusterApplierService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "master node changed {previous [], current [{elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{zu2ROiucQJ2NXMzsq-wjQQ}{172.30.188.107}{172.30.188.107:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}]}, added {{elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{fs5oHewwTTKOWchdZ2h8qQ}{172.30.252.247}{172.30.252.247:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true},}, term: 10, version: 19, reason: Publication{term=10, version=19}"  }

Hi @DavidTurner

Wiped the storage clean and did another re-deploy attempt.
Same outcome at the K8s level: the pods become 2/2 ready (the election completes) and then it fails...

elasticsearch-master-1   2/2   Running   0     92s   172.30.188.112      <none>   <none>
elasticsearch-master-0   2/2   Running   0     94s   172.30.66.115      <none>   <none>
elasticsearch-master-2   2/2   Running   0     96s   172.30.252.255      <none>   <none>
elasticsearch-master-1   1/2   Running   0     2m1s   172.30.188.112      <none>   <none>
elasticsearch-master-0   1/2   Running   0     2m3s   172.30.66.115      <none>   <none>
elasticsearch-master-2   1/2   Running   0     2m6s   172.30.252.255      <none>   <none>

For cluster.initial_master_nodes we use the elasticsearch-master-headless service.

NAME                            TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)             AGE
elasticsearch-master            ClusterIP      172.21.97.125   <none>          9200/TCP,9300/TCP   62m
elasticsearch-master-headless   ClusterIP      None            <none>          9200/TCP,9300/TCP   62m


# kubectl describe svc elasticsearch-master-headless
Name:              elasticsearch-master-headless
Namespace:         default
Labels:            app=elasticsearch-master
                   chart=elasticsearch-master-7.1.1
                   heritage=Tiller
                   release=
Annotations:       service.alpha.kubernetes.io/tolerate-unready-endpoints: true
Selector:          app=elasticsearch-master
Type:              ClusterIP
IP:                None
Port:              http  9200/TCP
TargetPort:        9200/TCP
Endpoints:         172.30.188.112:9200,172.30.252.255:9200,172.30.66.115:9200
Port:              transport  9300/TCP
TargetPort:        9300/TCP
Endpoints:         172.30.188.112:9300,172.30.252.255:9300,172.30.66.115:9300
Session Affinity:  None
Events:            <none>

# kubectl describe svc elasticsearch-master
Name:              elasticsearch-master
Namespace:         default
Labels:            <none>
Annotations:       <none>
Selector:          app=elasticsearch-master,chart=elasticsearch-master-7.1.1,heritage=Tiller,release=
Type:              ClusterIP
IP:                172.21.97.125
Port:              http  9200/TCP
TargetPort:        9200/TCP
Endpoints:
Port:              transport  9300/TCP
TargetPort:        9300/TCP
Endpoints:
Session Affinity:  None
Events:            <none>

Ah, just to mention, we use minimumMasterNodes: 2, passed into the pods as:
- name: discovery.zen.minimum_master_nodes
  value: "{{ .Values.minimumMasterNodes }}"

Uploaded the logs also from this attempt:

'elastic-master-0_take6.log' https://upload-staging.elastic.dev/d/0094a38d467ada2386df466aee81756c67bd2b6ee4b7e07a2abcb50e9f22b9fe

'elastic-master-1_take6.log' https://upload-staging.elastic.dev/d/ebb01d7a06a06fd15b18691554b29b02a8dfd9445a708b1c186bac7476f24ea6

'elastic-master-2_take6.log' https://upload-staging.elastic.dev/d/a1c82e202dd1c643d4cc5f90ab5c04bf7963480af6563087b2af5aba3ae16fd8

Thank you,
Zoltan

I believe discovery.zen.minimum_master_nodes will not actually do anything anymore. As @DavidTurner mentioned, maybe there are other parts of your bootstrapping configuration that are not valid anymore?

Yes, this does nothing in 7.x.

The logs you shared indicate that the master nodes successfully formed a new cluster, but the data nodes cannot join it since they belong to a different cluster. In other words, it looks like you have not wiped the storage clean on the data nodes:

failed to validate incoming join request from node [{elasticsearch-data-1}{W3cGpJFxTe-b3KcD_Lw85Q}{sijajRnkT6qpGDc0ptIkmw}{172.30.188.99}{172.30.188.99:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}]" ,
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [elasticsearch-data-1][172.30.188.99:9300][internal:cluster/coordination/join/validate]",
"Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid 1JDq1yWaRBGgz0QoTjlKew than local cluster uuid 2KX45kJLQUm3U9P4kfeZgg, rejecting",

The data node belongs to an earlier cluster with UUID 2KX45kJLQUm3U9P4kfeZgg whereas the master nodes just formed a cluster with UUID 1JDq1yWaRBGgz0QoTjlKew. You cannot move nodes between clusters like this.
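
If it helps, the structured log lines record which UUID each node is using, so something like this over the files you uploaded lists every distinct UUID seen per file (the data nodes' own logs would need the same check):

$ grep -o '"cluster.uuid": "[^"]*"' elastic-master-*_take6.log | sort | uniq -c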

Hi @DavidTurner,

Yes, I also wiped the elastic-data nodes' storage and re-deployed successfully. Finally, it works! :partying_face:
But I have also deleted all the indices and all the data from the "nodes" directory (not important now, but it may be in the future...). Is there a way to delete only the cluster UUID info and keep all the indices/useful data?

Regarding the storage for the elastic-data nodes, should we migrate to storage similar to what the elastic-master nodes now have?
elastic-master nodes NOW: 3x 20Gi with 1000 IOPS (20,000 IOPS)
elastic-data nodes NOW: 2x 150Gi with 10 IOPS/GB (1,500 IOPS each)

Also, can you point us to some official, working ELK 7.1.1 deployment templates for Kubernetes, with or without Helm charts, so we can compare ours and fix/tune it accordingly?

Thanks a lot for your effort and assistance in our struggle!

Br,
Zoltan

This is not necessary and would be deeply unsafe. The cluster UUID is intimately linked to the "useful" data: you can't lose one without the other.

Thanks for the info, @DavidTurner.

Awaiting a reply, whenever you have the spare time, to my other 2 questions from my previous post.

Thank you,
Zoltan

It depends on your workload. We generally recommend against network-attached storage entirely, whether your focus is search speed or indexing speed. Aside from that, as long as your cluster is stable under load, your disks are fast enough.

Is Elastic Cloud for Kubernetes what you are looking for?

Hi @DavidTurner,

Reproduced the whole scenario in a clone of our previous ELK deployment, with 500G of index data on the 2x elastic-data nodes.
Provisioned Block Storage with performance IOPS for all the master and data node volumes. Copied all the data over from the previous FileStorage volumes, so there should be no UUID issues.
The master election fails and the cluster is not becoming ready at all... :neutral_face:

Attaching the logs:

'elastic-master-0_take7.log' https://upload-staging.elastic.dev/d/3cd81f6778902134c18b2bb360607d82902ee2db3279ae770bdae86ce66686c4

'elastic-master-1_take7.log' https://upload-staging.elastic.dev/d/35aeee30fc670d148b74927cf7cdb8e6a1e454d5688e1a2e446187125bc16884

Please advise.
Thx,
Zoltan

All the extra TRACE logging is perhaps no longer necessary, and it is making it harder to see the problem, which is described in this WARN-level message:

have discovered [
{elasticsearch-master-2}{AfaPrC0YSVSTPsN7CM5pOQ}{5IwZxI7bRoCOpNxQt2oBlw}{172.30.170.214}{172.30.170.214:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}, 
{elasticsearch-master-1}{AfaPrC0YSVSTPsN7CM5pOQ}{tld0QAThRVq_2dUGfpiN9A}{172.30.239.41}{172.30.239.41:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}
]
discovery will continue using [172.30.239.41:9300, 172.30.170.214:9300] from hosts providers and [
{elasticsearch-master-0}{AfaPrC0YSVSTPsN7CM5pOQ}{sqeGfhpJQ9-lIC0wyLl8dw}{172.30.205.23}{172.30.205.23:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 611, last-accepted version 217886 in term 611"

Your three nodes all have the same node ID, which means you must have copied their data paths. Don't do this: instead, start each node with an empty data path.
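
You can double-check this from the logs you uploaded using the node descriptors that appear throughout them; each descriptor is the node name followed by its persistent node ID, so duplicated IDs show up immediately:

$ grep -oh '{elasticsearch-master-[0-9]}{[^}]*}' elastic-master-*_take7.log | sort -u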

You should also address these warnings because they are adding a lot of noise:

{"type": "server", "timestamp": "2019-10-24T17:31:22,061+0000", "level": "WARN", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "caught exception while handling client http traffic, closing connection Netty4HttpChannel{localAddress=0.0.0.0/0.0.0.0:9200, remoteAddress=/127.0.0.1:57350}" ,
"stacktrace": ["io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 474554202f...",

This is caused by an external agent called Sysdig Agent/1.0 trying to get information about the cluster over HTTP rather than HTTPS.
