Elasticsearch master election is taking too long (30+ mins)

Hi @DavidTurner

I have maxed out the IOPS setting for the FileStorage used by the elastic-master persistent volume claims, from 2 IOPS (the initial level) to 10 IOPS, and re-deployed the whole ELK Helm chart from scratch. This time it took 14 minutes, definitely an improvement, but I guess it is still not quite OK.

What would be an acceptable time for a well-functioning elastic-master election on K8s to finish?

Also, could we get rid of these persistent volume claims (FileStorage) and have the cluster state written only to the pod itself (K8s worker - RAM)?

Uploaded the logs for this take also:

'elastic-master-0_take4.log' https://upload-staging.elastic.dev/d/c0e02edeb8a6def5167bddab0d96cbd3a4ece06e1915e2cafb805741fb49f8bf

'elastic-master-1_take4.log' https://upload-staging.elastic.dev/d/02b68a4e09b16bebc72f98fab5fab5aa612a22dfc2d08dee981132ba98b6eed5

'elastic-master-2_take4.log' https://upload-staging.elastic.dev/d/3c68fac54ca982bf470633576f37c742feaf7573a8f2d076d21daac11b4ebe13

Thanks in advance,
Zoltan

That sounds very low, especially if you intend to index data into the cluster, which can be quite I/O intensive.

All data and master nodes require persistent storage.

Got them. Don't worry, there's no concept of "overwriting" in this upload service.

Your IO performance is dreadful:

{
  "type": "server",
  "timestamp": "2019-10-21T19:07:00,538+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.09.11/JkGSQ4cMSRmPi_p8plizGQ]] writing state, reason [freshly created]"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,094+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.09.11/JkGSQ4cMSRmPi_p8plizGQ]] state written"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,094+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.10.13/Uhq5bLsVR0C4UXRd5l4lZQ]] writing state, reason [freshly created]"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,467+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.10.13/Uhq5bLsVR0C4UXRd5l4lZQ]] state written"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,467+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.08.01/I3ZdbMx4SbKO_BnxAFqyYA]] writing state, reason [freshly created]"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,772+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.08.01/I3ZdbMx4SbKO_BnxAFqyYA]] state written"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:01,772+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.08.10/76TddLGuTHKpwx1TMggU8w]] writing state, reason [freshly created]"
}
{
  "type": "server",
  "timestamp": "2019-10-21T19:07:02,138+0000",
  "level": "TRACE",
  "component": "o.e.g.MetaStateService",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-0",
  "cluster.uuid": "2KX45kJLQUm3U9P4kfeZgg",
  "node.id": "HgRBrqoTQqq_sUTFHbbYYw",
  "message": "[[logstash-logs-2019.08.10/76TddLGuTHKpwx1TMggU8w]] state written"
}

That's the log for writing the metadata for just 4 indices, which took from 19:07:00,538 to 19:07:02,138, i.e. 1600ms in total or an average of 400ms each. These metadata files are typically very small, a few kB at most, so should take nowhere near that long to write. Even a slow spinning disk should be able to durably write a small file in a few tens of milliseconds, and an SSD can normally easily beat 10ms. I think you need to work on improving this.
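If you want to measure this directly, something along these lines can serve as a rough sketch (the data path is the default for the official images, and the elasticsearch container name is an assumption about your pod layout):

# Rough check of durable small-write latency on the master's data volume.
# oflag=dsync forces a synchronous write per block, so (elapsed time / count)
# approximates the per-write latency Elasticsearch sees when persisting metadata.
kubectl exec elasticsearch-master-0 -c elasticsearch -- \
  dd if=/dev/zero of=/usr/share/elasticsearch/data/ddtest bs=8k count=100 oflag=dsync
kubectl exec elasticsearch-master-0 -c elasticsearch -- \
  rm -f /usr/share/elasticsearch/data/ddtest

On healthy storage that should work out to at most a few tens of milliseconds per write.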

It might not be obvious why a relatively small difference in IO performance results in master elections taking the best part of an hour rather than a few hundred milliseconds. The trouble is that each election must write out all the index metadata, and concurrent elections typically fail. It normally takes hardly any time to write out all the index metadata and complete the election so the chance of two elections being concurrent is very low, but if it is taking many minutes to complete the election then they will almost always be concurrent. Elasticsearch keeps on trying until eventually one gets lucky and succeeds.


Hi @DavidTurner,

Provisioned and configured 3 new PVs for the 3 elastic-master nodes, of type Block Storage, 20Gi each with 1000 IOPS (meaning a total of 20000 IOPS for each of them, if I am right).

Redeployed the whole ELK stack. It worked on the first attempt; the master election was done after 99s:

elasticsearch-master-1   2/2   Running   0     99s   172.30.188.83    <none>   <none>
elasticsearch-master-2   2/2   Running   0     103s   172.30.252.230    <none>   <none>
elasticsearch-master-1   1/2   Running   0     2m9s   172.30.188.83    <none>   <none>
elasticsearch-master-2   1/2   Running   0     2m13s   172.30.252.230   <none>   <none>

And after a couple of seconds it broke, as you can see above (the pods dropped from 2/2 back to 1/2 ready):

{"type": "server", "timestamp": "2019-10-22T11:10:14,588+0000", "level": "INFO", "component": "o.e.c.s.MasterService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "elected-as-master ([2] nodes joined)[{elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{4AmiGR6wSvSA5HFDdbSU5Q}{172.30.188.83}{172.30.188.83:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20} elect leader, {elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 6, version: 1, reason: master node changed {previous [], current [{elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{4AmiGR6wSvSA5HFDdbSU5Q}{172.30.188.83}{172.30.188.83:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}]}, added {{elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true},}"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,592+0000", "level": "TRACE", "component": "o.e.c.c.CoordinationState", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "handleClientValue: processing request for version [1] and term [6]"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,606+0000", "level": "TRACE", "component": "o.e.c.c.LeaderChecker", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "setCurrentNodes: nodes: \n   {elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}\n   {elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{4AmiGR6wSvSA5HFDdbSU5Q}{172.30.188.83}{172.30.188.83:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}, local, master\n"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,607+0000", "level": "TRACE", "component": "o.e.c.c.FollowersChecker", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "handleWakeUp: checking {elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true} with FollowerCheckRequest{term=6, sender={elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{4AmiGR6wSvSA5HFDdbSU5Q}{172.30.188.83}{172.30.188.83:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}}"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,628+0000", "level": "TRACE", "component": "o.e.c.c.LeaderChecker", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "handling LeaderCheckRequest{sender={elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}}"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,628+0000", "level": "TRACE", "component": "o.e.c.c.FollowersChecker", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "FollowerChecker{discoveryNode={elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{2191vwLgRtW6WRKwzQqupg}{172.30.252.230}{172.30.252.230:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}, failureCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3} check successful"  }
{"type": "server", "timestamp": "2019-10-22T11:10:14,627+0000", "level": "TRACE", "component": "o.e.c.c.C.CoordinatorPublication", "cluster.name": "elasticsearch", "node.name": "elasticsearch-

Above is a piece of the log from the successful election.

I then made a separate attempt to re-deploy, and now it does not deploy at all.

Uploaded the logs for this take also:

'elastic-master-0_take5.log' https://upload-staging.elastic.dev/d/fedcf6ce3afae39b02eb6c7ff94d76433a2ca53027252f85378ac92796cde9af

'elastic-master-1_take5.log' https://upload-staging.elastic.dev/d/ec5b2ac20ad4d076a1067f9de51219d47c2a383608bf0c46fa687a6c6ec76c03

'elastic-master-2_take5.log' https://upload-staging.elastic.dev/d/bc44d5a381de0b2957fd68aea00a96f3464419d44c6f877afbad229cd0ff1595

There is a new error repeating. For example:

{"type": "server", "timestamp": "2019-10-22T11:29:39,324+0000", "level": "TRACE", "component": "o.e.c.c.Coordinator", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "cluster.uuid": "saNE1pJcSQyuHAg7huMu7A", "node.id": "mNDX_1IKR0SW3knwJJHyCA",  "message": "handleJoinRequest: as LEADER, handling JoinRequest{sourceNode={elasticsearch-data-1}{W3cGpJFxTe-b3KcD_Lw85Q}{KWbSRxscTOidCb94QYpbug}{172.30.188.109}{172.30.188.109:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}, optionalJoin=Optional.empty}"  }
{"type": "server", "timestamp": "2019-10-22T11:29:39,374+0000", "level": "WARN", "component": "o.e.c.c.Coordinator", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1", "cluster.uuid": "saNE1pJcSQyuHAg7huMu7A", "node.id": "mNDX_1IKR0SW3knwJJHyCA",  "message": "failed to validate incoming join request from node [{elasticsearch-data-1}{W3cGpJFxTe-b3KcD_Lw85Q}{KWbSRxscTOidCb94QYpbug}{172.30.188.109}{172.30.188.109:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}]" ,
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [elasticsearch-data-1][172.30.188.109:9300][internal:cluster/coordination/join/validate]",
"Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid saNE1pJcSQyuHAg7huMu7A than local cluster uuid 2KX45kJLQUm3U9P4kfeZgg, rejecting",

PS: Please be aware that the new storage was blank; maybe some steps should have been done before we deployed and we skipped them, just an idea...

Thanks in advance,
Zoltan

The cluster UUID is set (randomly) the first time the cluster forms, and is stored on disk so it persists across restarts. This message tells us that there are (or have been) two clusters with two different IDs. Make sure you really did start with blank storage, and also make sure you are following the bootstrapping instructions, particularly this:

WARNING: You must set cluster.initial_master_nodes to the same list of nodes on each node on which it is set in order to be sure that only a single cluster forms during bootstrapping and therefore to avoid the risk of data loss.
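One quick way to check this is to confirm that all three master pods were started with exactly the same cluster.initial_master_nodes value; a sketch, assuming the chart passes the setting to the pods as an environment variable and that the Elasticsearch container is named elasticsearch:

# Confirm every master pod was started with exactly the same bootstrap list.
for pod in elasticsearch-master-0 elasticsearch-master-1 elasticsearch-master-2; do
  echo "--- $pod"
  kubectl exec "$pod" -c elasticsearch -- env | grep -i initial_master_nodes
done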

The initial discovery and election process took about 4 seconds out of that time:

$ grep elastic-master-1_take5.log -e 'starting \.\.\.\|master node changed'
{"type": "server", "timestamp": "2019-10-22T11:40:59,502+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "starting ..."  }
{"type": "server", "timestamp": "2019-10-22T11:41:03,172+0000", "level": "INFO", "component": "o.e.c.s.MasterService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "elected-as-master ([2] nodes joined)[{elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{fs5oHewwTTKOWchdZ2h8qQ}{172.30.252.247}{172.30.252.247:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true} elect leader, {elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{zu2ROiucQJ2NXMzsq-wjQQ}{172.30.188.107}{172.30.188.107:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 10, version: 19, reason: master node changed {previous [], current [{elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{zu2ROiucQJ2NXMzsq-wjQQ}{172.30.188.107}{172.30.188.107:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}]}, added {{elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{fs5oHewwTTKOWchdZ2h8qQ}{172.30.252.247}{172.30.252.247:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true},}"  }
{"type": "server", "timestamp": "2019-10-22T11:41:03,726+0000", "level": "INFO", "component": "o.e.c.s.ClusterApplierService", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "master node changed {previous [], current [{elasticsearch-master-1}{mNDX_1IKR0SW3knwJJHyCA}{zu2ROiucQJ2NXMzsq-wjQQ}{172.30.188.107}{172.30.188.107:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}]}, added {{elasticsearch-master-2}{3360Dk1sSNi2hYekfYRP-A}{fs5oHewwTTKOWchdZ2h8qQ}{172.30.252.247}{172.30.252.247:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true},}, term: 10, version: 19, reason: Publication{term=10, version=19}"  }

Hi @DavidTurner

Wiped the storage clean and made another re-deploy attempt.
Same outcome at the K8s level: the pods become 2/2 ready (=> election completed) and then it fails...

elasticsearch-master-1   2/2   Running   0     92s   172.30.188.112      <none>   <none>
elasticsearch-master-0   2/2   Running   0     94s   172.30.66.115      <none>   <none>
elasticsearch-master-2   2/2   Running   0     96s   172.30.252.255      <none>   <none>
elasticsearch-master-1   1/2   Running   0     2m1s   172.30.188.112      <none>   <none>
elasticsearch-master-0   1/2   Running   0     2m3s   172.30.66.115      <none>   <none>
elasticsearch-master-2   1/2   Running   0     2m6s   172.30.252.255      <none>   <none>

For cluster.initial_master_nodes we use the elasticsearch-master-headless service.

NAME                            TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)             AGE
elasticsearch-master            ClusterIP      172.21.97.125   <none>          9200/TCP,9300/TCP   62m
elasticsearch-master-headless   ClusterIP      None            <none>          9200/TCP,9300/TCP   62m


# kubectl describe svc elasticsearch-master-headless
Name:              elasticsearch-master-headless
Namespace:         default
Labels:            app=elasticsearch-master
                   chart=elasticsearch-master-7.1.1
                   heritage=Tiller
                   release=
Annotations:       service.alpha.kubernetes.io/tolerate-unready-endpoints: true
Selector:          app=elasticsearch-master
Type:              ClusterIP
IP:                None
Port:              http  9200/TCP
TargetPort:        9200/TCP
Endpoints:         172.30.188.112:9200,172.30.252.255:9200,172.30.66.115:9200
Port:              transport  9300/TCP
TargetPort:        9300/TCP
Endpoints:         172.30.188.112:9300,172.30.252.255:9300,172.30.66.115:9300
Session Affinity:  None
Events:            <none>

# kubectl describe svc elasticsearch-master
Name:              elasticsearch-master
Namespace:         default
Labels:            <none>
Annotations:       <none>
Selector:          app=elasticsearch-master,chart=elasticsearch-master-7.1.1,heritage=Tiller,release=
Type:              ClusterIP
IP:                172.21.97.125
Port:              http  9200/TCP
TargetPort:        9200/TCP
Endpoints:
Port:              transport  9300/TCP
TargetPort:        9300/TCP
Endpoints:
Session Affinity:  None
Events:            <none>

Ah, just to mention, we use minimumMasterNodes: 2, set via:

- name: discovery.zen.minimum_master_nodes
  value: "{{ .Values.minimumMasterNodes }}"

Uploaded the logs also from this attempt:

'elastic-master-0_take6.log' https://upload-staging.elastic.dev/d/0094a38d467ada2386df466aee81756c67bd2b6ee4b7e07a2abcb50e9f22b9fe

'elastic-master-1_take6.log' https://upload-staging.elastic.dev/d/ebb01d7a06a06fd15b18691554b29b02a8dfd9445a708b1c186bac7476f24ea6

'elastic-master-2_take6.log' https://upload-staging.elastic.dev/d/a1c82e202dd1c643d4cc5f90ab5c04bf7963480af6563087b2af5aba3ae16fd8

Thank you,
Zoltan

I believe discovery.zen.minimum_master_nodes will not actually do anything anymore. As @DavidTurner mentioned, maybe there are other parts of your bootstrapping configuration that are not valid anymore?


Yes, this does nothing in 7.x.

The logs you shared indicate that the master nodes successfully formed a new cluster, but the data nodes cannot join them since they belong to a different cluster. In other words, it looks like you have not wiped the storage clean on the data nodes:

failed to validate incoming join request from node [{elasticsearch-data-1}{W3cGpJFxTe-b3KcD_Lw85Q}{sijajRnkT6qpGDc0ptIkmw}{172.30.188.99}{172.30.188.99:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}]" ,
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [elasticsearch-data-1][172.30.188.99:9300][internal:cluster/coordination/join/validate]",
"Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid 1JDq1yWaRBGgz0QoTjlKew than local cluster uuid 2KX45kJLQUm3U9P4kfeZgg, rejecting",

The data node belongs to an earlier cluster with UUID 2KX45kJLQUm3U9P4kfeZgg whereas the master nodes just formed a cluster with UUID 1JDq1yWaRBGgz0QoTjlKew. You cannot move nodes between clusters like this.
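If in doubt, you can compare the cluster_uuid that each node reports from its root endpoint; a sketch, using the service name and the data node address from your output (add credentials/TLS flags if security is enabled on the HTTP layer):

# Compare the cluster_uuid reported by the current master and by the rejected data node.
curl -s http://elasticsearch-master:9200/ | grep cluster_uuid
curl -s http://172.30.188.99:9200/ | grep cluster_uuid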


Hi @DavidTurner,

Yes, I also wiped the elastic-data nodes' storage and re-deployed successfully. Finally, it works! :partying_face:
But I also deleted all the indices and all data from the "nodes" directory (not important now, but it may be in the future...). Is there another way to delete only the cluster UUID info and keep all the indices/useful data?

Regarding the storage for the elastic-data nodes, should we migrate to storage similar to what the elastic-master nodes now have?
elastic-master nodes NOW: 3x20Gi with 1000 IOPS (20000 IOPS)
elastic-data nodes NOW: 2x150Gi with 10 IOPS (1500 IOPS)

Also, can you point us to some official, working ELK 7.1.1 deployment templates for Kubernetes, with or without Helm charts, so we could compare ours and fix/tune it accordingly?

Thanks a lot for your effort and assistance in our struggle!

Br,
Zoltan

This is not necessary and would be deeply unsafe. The cluster UUID is intimately linked to the "useful" data - you can't lose one without the other.


Thx for the info @DavidTurner

Awaiting a reply, when you have some spare time, on my other 2 questions from my previous post.

Thank you,
Zoltan

It depends on your workload. We generally recommend against network-attached storage entirely, whether your focus is search speed or indexing speed. Aside from that, as long as your cluster is stable under load then your disks are fast enough.
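If you want to keep an eye on disk behaviour under load, the nodes stats API exposes per-node filesystem and I/O statistics (a sketch; adjust the host, scheme, and credentials to your setup):

# Per-node filesystem usage and (on Linux) I/O statistics.
curl -s 'http://elasticsearch-master:9200/_nodes/stats/fs?pretty'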

Is Elastic Cloud for Kubernetes what you are looking for?

Hi @DavidTurner,

Reproduced the whole scenario in a clone of our previous ELK deployment, with 500G of index data on the 2 elastic-data nodes.
Provisioned Block Storage with performance IOPS for all the storage of the master and data nodes, and copied all the data from the previous FileStorage volumes, so there should be no UUID issues.
The master election fails and the cluster is not becoming ready at all... :neutral_face:

Attaching the logs:

'elastic-master-0_take7.log' https://upload-staging.elastic.dev/d/3cd81f6778902134c18b2bb360607d82902ee2db3279ae770bdae86ce66686c4

'elastic-master-1_take7.log' https://upload-staging.elastic.dev/d/35aeee30fc670d148b74927cf7cdb8e6a1e454d5688e1a2e446187125bc16884

Please advise.
Thx,
Zoltan

All the extra TRACE logging is perhaps no longer necessary and is making it harder to see the problem, which is described in this WARN-level message:

have discovered [
{elasticsearch-master-2}{AfaPrC0YSVSTPsN7CM5pOQ}{5IwZxI7bRoCOpNxQt2oBlw}{172.30.170.214}{172.30.170.214:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}, 
{elasticsearch-master-1}{AfaPrC0YSVSTPsN7CM5pOQ}{tld0QAThRVq_2dUGfpiN9A}{172.30.239.41}{172.30.239.41:9300}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true}
]
discovery will continue using [172.30.239.41:9300, 172.30.170.214:9300] from hosts providers and [
{elasticsearch-master-0}{AfaPrC0YSVSTPsN7CM5pOQ}{sqeGfhpJQ9-lIC0wyLl8dw}{172.30.205.23}{172.30.205.23:9300}{ml.machine_memory=2147483648, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 611, last-accepted version 217886 in term 611"

Your three nodes all have the same node ID which means you must have copied their data paths. Don't do this: instead, start each node with an empty data path.

You should also address these warnings because they are adding a lot of noise:

{"type": "server", "timestamp": "2019-10-24T17:31:22,061+0000", "level": "WARN", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "elasticsearch", "node.name": "elasticsearch-master-1",  "message": "caught exception while handling client http traffic, closing connection Netty4HttpChannel{localAddress=0.0.0.0/0.0.0.0:9200, remoteAddress=/127.0.0.1:57350}" ,
"stacktrace": ["io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 474554202f...",

This is caused by an external agent called Sysdig Agent/1.0 trying to get information about the cluster over HTTP rather than HTTPS.


Thx @DavidTurner

Will do a test right now. But how can I bring data onto my data nodes so I can test this in our usual usage scenario rather than with empty data nodes? Because with empty data nodes it works...

Br,
Zoltan

It's a bad idea to restore a cluster by copying or cloning data at the filesystem level as you seem to be doing. You should use snapshot and restore to import indices into an empty cluster instead.
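A minimal sketch of that flow with the snapshot APIs, assuming a shared filesystem repository mounted at /mnt/backups on all nodes of both clusters (the hosts, repository name, and snapshot name are placeholders; other repository types such as S3 work the same way):

# Register a shared filesystem repository on the source cluster; the location
# must be listed in path.repo on every node and be reachable from all of them.
curl -s -X PUT 'http://old-cluster:9200/_snapshot/my_backup' \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/mnt/backups"}}'

# Take a snapshot of the source cluster and wait for it to finish.
curl -s -X PUT 'http://old-cluster:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true'

# Register the same repository on the new, empty cluster and restore from it.
curl -s -X PUT 'http://elasticsearch-master:9200/_snapshot/my_backup' \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/mnt/backups"}}'
curl -s -X POST 'http://elasticsearch-master:9200/_snapshot/my_backup/snapshot_1/_restore'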

Hi @DavidTurner,

Managed to start the cluster with what I guess is a workaround; I will detail it below.

It looks like, after all the previous re-deploys of the same cluster, the data path of each elastic-master node looked like this:

root@test-pod-m1:/data/old_m1# cd nodes/

root@test-pod-m1:/data/old_m1/nodes# ls -latr
total 20
drwxr-xr-x 4 1000       1000 4096 Jun 24 12:39 ..
drwxrwxr-x 4 1000 4294967294 4096 Oct 14 12:51 0
drwxrwxr-x 4 1000 4294967294 4096 Oct 14 12:51 1
drwxrwxr-x 4 1000 4294967294 4096 Oct 14 12:51 2
drwxrwxr-x 5 1000 4294967294 4096 Oct 25 10:18 .

=> containing folders for all 3 possible node IDs of the cluster

root@test-pod-m1:/data/old_m1/nodes/0/_state# ls -latr
total 48
-rw-rw-r-- 1 1000 4294967294    71 Oct 14 12:47 node-291.st
drwxrwxr-x 4 1000 4294967294  4096 Oct 14 12:51 ..
-rw-rw-r-- 1 1000 4294967294 23456 Oct 14 13:02 global-176.st
-rw-rw-r-- 1 1000 4294967294  6863 Oct 23 00:00 manifest-74201.st
drwxrwxr-x 2 1000 4294967294  4096 Oct 25 10:19 .

root@test-pod-m1:/data/old_m1/nodes# cat 0/_state/node-291.st
?▒lstate:)
▒node_idUAfaPrC0YSVSTPsN7CM5pOQ▒▒(▒▒▒+▒


root@test-pod-m1:/data/old_m1/nodes/1/_state# ls -latr
total 48
-rw-rw-r-- 1 1000 4294967294    71 Oct 14 12:47 node-119.st
drwxrwxr-x 4 1000 4294967294  4096 Oct 14 12:51 ..
-rw-rw-r-- 1 1000 4294967294 23462 Oct 14 13:02 global-175.st
-rw-rw-r-- 1 1000 4294967294  6864 Oct 23 00:00 manifest-218378.st
drwxrwxr-x 2 1000 4294967294  4096 Oct 23 00:00 .

root@test-pod-m1:/data/old_m1/nodes# cat 1/_state/node-119.st
?▒lstate:)
▒node_idUmOXnX6QwQDOeRDqsf4b3dg▒▒(▒▒▒▒


root@test-pod-m1:/data/old_m1/nodes/2/_state# ls -latr
total 48
-rw-rw-r-- 1 1000 4294967294    71 Oct 14 12:47 node-55.st
drwxrwxr-x 4 1000 4294967294  4096 Oct 14 12:51 ..
-rw-rw-r-- 1 1000 4294967294 23443 Oct 14 13:02 global-145.st
-rw-rw-r-- 1 1000 4294967294  6862 Oct 23 00:00 manifest-7568.st
drwxrwxr-x 2 1000 4294967294  4096 Oct 23 00:00 .

root@test-pod-m1:/data/old_m1/nodes# cat 2/_state/node-55.st
?▒lstate:)
▒node_idUHgRBrqoTQqq_sUTFHbbYYw▒▒(▒▒U▒ 

=> At startup on the old storage, the cluster somehow knew which of the 3 folders (and hence which of the 3 node IDs) to pick on each elastic-master node. But on the new storage, after I copied the data, it did not know which folder to pick: for example, in the previous attempt it picked folder 0 on each of the 3 elastic-master nodes => the node ID would be "AfaPrC0YSVSTPsN7CM5pOQ" on all 3 of them...

Workaround: I deleted all the folders from the data path, created a folder 0 on each node, and added the data for the specific node ID of each specific node:

[root@elasticsearch-master-0 data]# cd nodes/
[root@elasticsearch-master-0 nodes]# ls -latr
total 12
drwxrwsr-x 5 root elasticsearch 4096 Oct 25 09:04 ..
drwxrwsr-x 3 root elasticsearch 4096 Oct 25 11:29 .
drwxrwsr-x 4 root elasticsearch 4096 Oct 25 11:36 0
[root@elasticsearch-master-0 nodes]# cat 0/_state/node-292.st
?▒lstate:)
▒node_idUAfaPrC0YSVSTPsN7CM5pOQ▒▒(▒▒▒+▒


[root@elasticsearch-master-1 data]# cd nodes/
[root@elasticsearch-master-1 nodes]# ls -latr
total 12
drwxrwsr-x 5 root elasticsearch 4096 Oct 25 09:04 ..
drwxrwsr-x 3 root elasticsearch 4096 Oct 25 11:30 .
drwxrwsr-x 4 root elasticsearch 4096 Oct 25 11:36 0
[root@elasticsearch-master-1 nodes]# cat 0/_state/node-120.st
?▒lstate:)
▒node_idUmOXnX6QwQDOeRDqsf4b3dg▒▒(▒▒▒▒


[root@elasticsearch-master-2 data]# cd nodes/
[root@elasticsearch-master-2 nodes]# ls -latr
total 12
drwxrwsr-x 5 root elasticsearch 4096 Oct 25 09:04 ..
drwxrwsr-x 3 root elasticsearch 4096 Oct 25 11:30 .
drwxrwsr-x 4 root elasticsearch 4096 Oct 25 11:36 0
[root@elasticsearch-master-2 nodes]# cat 0/_state/node-56.st
?▒lstate:)
▒node_idUHgRBrqoTQqq_sUTFHbbYYw▒▒(▒▒U▒

Re-deployed the cluster; the election was successful and the readiness/liveness probes were OK after a total of 5 minutes - totally acceptable for us.

Let me know whether it is OK from your side to leave this cluster setup in use with the mentioned workarounds.

Thank you,
Gergely Zoltan

:grimacing: No, I don't really recommend doing anything like this. You should be starting up nodes with empty data directories and restoring data from snapshots. You shouldn't be cloning nodes like this. There's a real risk that these cloned nodes might discover each other at some point and this will result in serious data loss.

You should also not be using the deprecated setting node.max_local_storage_nodes, which is a further source of confusion here. It's much simpler to give each node its own data path instead.
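If it is unclear where that setting is coming from, a quick check along these lines may help (a sketch; the config path is the default for the official images and the container name is an assumption):

# Look for the deprecated setting in the rendered configuration and the environment.
kubectl exec elasticsearch-master-0 -c elasticsearch -- \
  grep -R max_local_storage_nodes /usr/share/elasticsearch/config/ || true
kubectl exec elasticsearch-master-0 -c elasticsearch -- \
  sh -c 'env | grep -i max_local_storage' || true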


Hi @DavidTurner,

Thanks for the assistance.
In our case the old data (cluster) doesn't exist anymore, so it is safe to run this as it is.

Created another topic for ELK Resource requirements:


Can you please take a look at it when you have some time?

Thank you,
Zoltan