Elasticsearch not starting on Second Run... Voting Configuration and Quorum Issue

oriondark · July 8, 2019, 7:14pm

Startup error (not on the initial run of a new Elasticsearch instance):

"org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: cannot start with [discovery.type] set to [single-node] when local node {878bc9e4fbfa}{fwjpwdIvTAet6GrkBXymjg}{dTcQaACOQcO7GmOJds2Gkw}{172.18.0.2}{172.18.0.2:9300}{ml.machine_memory=8341348352, xpack.installed=true, ml.max_open_jobs=20} does not have quorum in voting configuration VotingConfiguration{AICGKQZ1TZux3KwYt0YhfA}..

---.yml rollup (ELK stack flavor from GitHub)

elasticsearch:
  build:
    context: elasticsearch/
    args:
      ELK_VERSION: $ELK_VERSION
  volumes:
    - ./elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro
    - ./data:/usr/share/elasticsearch/data
  ports:
    - "9200:9200"
    - "9300:9300"
  environment:
    ES_JAVA_OPTS: "-Xmx6g -Xms2g"
    ELASTIC_PASSWORD: changeme
  ulimits:
    memlock:
      soft: -1
      hard: -1
  networks:
    - elk

---elasticsearch.yml
cluster.name: "docker-cluster"
network.host: 0.0.0.0
node.ml: false
discovery.type: single-node
xpack.license.self_generated.type: basic

So, the first run always works. Once I push data into it and attempt a restart, it fails.
I've tried a variety of things with the config, but to no avail. I'm assuming it's something to do with the bootstrap checks and/or the existence of ML stuff (possibly from X-Pack)?

Any ideas on this?

My main goal is to get a dev instance going that I can load up, destroy, test, and generally push around.

Thanks so much!

DavidTurner · July 8, 2019, 9:29pm

The issue is that this node has unique ID fwjpwdIvTAet6GrkBXymjg, but it previously belonged to a cluster containing a node with ID AICGKQZ1TZux3KwYt0YhfA, and this latter node is known to have the freshest cluster state. Either there were multiple nodes in the cluster at one time, or else the contents of the data path are not persisting across restarts. The second case would be pretty weird, though, because it isn't that nothing in the data path persists: some of the data is surviving, but not all of it. It's vitally important that the whole data path persists across restarts.
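A minimal sketch of that comparison, for anyone who wants to check their own error message. It assumes the local node is printed as {name}{node ID}{ephemeral ID}{host}{address}{attributes} and that multiple voting IDs, if present, are comma-separated; the function name `parse_quorum_error` is made up for this example.

```python
import re


def parse_quorum_error(message: str):
    """Return (local_node_id, voting_ids) pulled out of the startup error text."""
    # Assumption: the second {...} group after "local node" is the node ID.
    local = re.search(r"local node \{[^}]*\}\{([^}]*)\}", message)
    # Assumption: IDs inside VotingConfiguration{...} are comma-separated.
    voting = re.search(r"VotingConfiguration\{([^}]*)\}", message)
    if not local or not voting:
        raise ValueError("message does not look like the quorum error")
    voting_ids = [v.strip() for v in voting.group(1).split(",") if v.strip()]
    return local.group(1), voting_ids


if __name__ == "__main__":
    error = (
        "cannot start with [discovery.type] set to [single-node] when local node "
        "{878bc9e4fbfa}{fwjpwdIvTAet6GrkBXymjg}{dTcQaACOQcO7GmOJds2Gkw}"
        "{172.18.0.2}{172.18.0.2:9300}{xpack.installed=true} does not have "
        "quorum in voting configuration VotingConfiguration{AICGKQZ1TZux3KwYt0YhfA}"
    )
    node_id, voting_ids = parse_quorum_error(error)
    # A single-node cluster can only start if its own ID is in the voting configuration.
    print("local node ID:", node_id)
    print("voting configuration:", voting_ids)
    print("local node is a voter:", node_id in voting_ids)
```

If the last line reports that the local node is not a voter, the node ID on disk no longer matches the one recorded in the cluster state.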

It looks like you're using Docker, but I don't recognise the Dockerfile. Where have you got that from? Also please don't use the password changeme.

oriondark · July 9, 2019, 2:17pm

David,

Thanks so much for the reply.
I got the elk stack images for docker from https://github.com/deviantony/docker-elk.
There should never have been more than 1 node from my understanding, and the data does persist between restarts.
To complicate things, I am running on Docker Desktop (for Windows) and the images and data are on a USB drive.
I was able to push everything through on the old 6.x versions with the same GitHub pieces, but keep running into the Voting Quorum thing.
I've tried to dive into the documentation, but don't exactly understand how to either bypass the bootstrap checks (doesn't sound like that's the issue), or, as you are helping me discover, harden (or fix) the node IDs between restarts.

Any further suggestions/help would be great!

Thanks so much for the reply. Love the product.

DavidTurner · July 9, 2019, 2:26pm

Certainly a nonstandard configuration, and a nonstandard Docker config too.

Which of the node IDs is the "right" one? I.e. when you start the node the first time it'll log a line like this:

[2019-07-09T15:14:02,382][INFO ][o.e.n.Node               ] [node-0] node name [node-0], node ID [WvFG5LtFSLiqfmM1IeOPHg], cluster name [elasticsearch]

Here WvFG5LtFSLiqfmM1IeOPHg is my node ID. When you start it up the second time, does the original node ID match the new node ID or does it match the one in the VotingConfiguration (or, even more strangely, does it match neither?)

When you say "the data does persist between restarts" can you be more precise, and share a list of the full paths of the files that are put on persistent storage? When you restart, do you see the last-modified dates of these files change?
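One way to produce such a listing is sketched below. It assumes the host-side data directory is `./data`, as in the volume mapping posted earlier; the function name `list_data_files` is made up for this example. Running it before and after a restart and comparing the two outputs shows which files changed or disappeared.

```python
import os
from datetime import datetime


def list_data_files(data_path: str):
    """Return sorted (relative_path, last_modified) pairs for every file under data_path."""
    rows = []
    for root, _dirs, files in os.walk(data_path):
        for name in files:
            full = os.path.join(root, name)
            modified = datetime.fromtimestamp(os.path.getmtime(full))
            rows.append(
                (os.path.relpath(full, data_path), modified.isoformat(timespec="seconds"))
            )
    return sorted(rows)


if __name__ == "__main__":
    # "./data" is the host directory mounted at /usr/share/elasticsearch/data.
    for rel_path, modified in list_data_files("./data"):
        print(modified, rel_path)
```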

oriondark · July 9, 2019, 2:50pm

Sorry... new IDs, I've been trying various things. Here's the log line:

elasticsearch_1 | {"type": "server", "timestamp": "2019-07-09T14:27:56,713+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "2580ec46aa93", "message": "node name [2580ec46aa93], node ID [PV9cpNj7Sxax7lyO3_mzmw], cluster name [docker-cluster]" }

And the accompanying Voting Quorum piece:

elasticsearch_1 | "stacktrace": ["org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: cannot start with [discovery.type] set to [single-node] when local node {2580ec46aa93}{PV9cpNj7Sxax7lyO3_mzmw}{u9dwInaTTK-qdlm2naqXlA}{172.18.0.2}{172.18.0.2:9300}{ml.machine_memory=6227419136, xpack.installed=true, ml.max_open_jobs=20} does not have quorum in voting configuration VotingConfiguration{6yNfx-jYQRm649gF67sa9g}",

Neither the u9dw... identifier nor the 6yNf... identifier occurs anywhere else in the output.

Data path: /data/nodes/0/[_state][node.lock][indices]/[all indices with uuid-type names]/state/0, etc. Everything lives in one place.

It appears that the 'state' pieces [nodes\0\indices\tW163-2tTDmFfzgOOcJxyQ_state\state-xx.st] have been modified since the last good run (when data was importing).
data\nodes\0\indices\KM4GWJjKTfq_6MX9FkmUzw\0\index contains 20 GB of cfe, si, cfs, etc. files, last modified during the last good run.

Did I get everything?

Thanks again.

DavidTurner · July 9, 2019, 3:38pm

Sorry, I will clarify. As I understand it, the first time you start this node up (empty data directory) it works, but then the second time it doesn't. I would like to see the log lines containing the node IDs from both runs, as well as the cannot start with [discovery.type] set to [single-node] line.

The two files to look at in particular are data/nodes/0/_state/global-NNN.st and data/nodes/0/_state/node-NNN.st (for decimal numbers NNN). They should be replaced with higher-numbered files each startup. Are they?
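A small sketch for checking this between runs, assuming the host-side state directory is `./data/nodes/0/_state` (from the volume mapping posted earlier); the function name `state_generations` is made up for this example. It collects the NNN values of the global-NNN.st and node-NNN.st files so the numbers can be compared from one startup to the next.

```python
import os
import re

# Matches global-NNN.st and node-NNN.st, where NNN is a decimal number.
STATE_FILE = re.compile(r"^(global|node)-(\d+)\.st$")


def state_generations(state_dir: str):
    """Return {"global": [...], "node": [...]} with the sorted NNN values found."""
    generations = {}
    for name in os.listdir(state_dir):
        match = STATE_FILE.match(name)
        if match:
            generations.setdefault(match.group(1), []).append(int(match.group(2)))
    return {kind: sorted(numbers) for kind, numbers in generations.items()}


if __name__ == "__main__":
    print(state_generations("./data/nodes/0/_state"))
```

If the "node" entry is missing from the result after a shutdown, the node-NNN.st file is not surviving on disk.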

oriondark · July 10, 2019, 2:53pm

Sorry for the delay. Had to travel yesterday.
Anywho,
here are some of the logs. The first three starts (create and kill; restart and create indices; restart and load data) worked. After the data load, things failed.

First Run
elasticsearch_1 | {"type": "server", "timestamp": "2019-07-10T13:02:22,695+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "16278c81ca88", "message": "node name [16278c81ca88], node ID [8ZqcR-3SS8aldEtzbbGbaQ], cluster name [docker-cluster]" }

Second Run - No Indices, No data
elasticsearch_1 | {"type": "server", "timestamp": "2019-07-10T13:06:41,243+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "16278c81ca88", "message": "node name [16278c81ca88], node ID [8ZqcR-3SS8aldEtzbbGbaQ], cluster name [docker-cluster]" }

Third run - Indices
elasticsearch_1 | {"type": "server", "timestamp": "2019-07-10T13:12:15,907+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "16278c81ca88", "message": "node name [16278c81ca88], node ID [8ZqcR-3SS8aldEtzbbGbaQ], cluster name [docker-cluster]" }

Fourth run - indices, data, and error
elasticsearch_1 | {"type": "server", "timestamp": "2019-07-10T14:40:36,614+0000", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "docker-cluster", "node.name": "16278c81ca88", "message": "node name [16278c81ca88], node ID [_WYf379jRwyIZn95s0hZBw], cluster name [docker-cluster]" }
elasticsearch_1 | "stacktrace": ["org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: cannot start with [discovery.type] set to [single-node] when local node {16278c81ca88}{_WYf379jRwyIZn95s0hZBw}{5qYSXsNXSaSOHp4amJJJqA}{172.18.0.2}{172.18.0.2:9300}{ml.machine_memory=6227410944, xpack.installed=true, ml.max_open_jobs=20} does not have quorum in voting configuration VotingConfiguration{8ZqcR-3SS8aldEtzbbGbaQ}",

The global and node files appear to be going up in number every time.
I have the full logs of each run. I can do another run and save the _indices piece of each run.
It appears that at some point, after data, the clusterId changes.

Thanks again for all the help. Please let me know if I can get you any more info.

DavidTurner · July 10, 2019, 3:38pm

Huh. I wasn't expecting to see that.

node ID [8ZqcR-3SS8aldEtzbbGbaQ]
...
node ID [_WYf379jRwyIZn95s0hZBw]

The node ID is changing. This is stored in data/nodes/0/_state/node-NNN.st, and we only ever create a new node ID if the file isn't there. It's a binary file, but it's tiny and contains the node ID in plain text so you should be able to see it in a text editor. The only explanation I can think of is that something is removing this file between runs.
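Without a text editor to hand, something like the following sketch pulls the readable strings out of the file. It assumes node IDs are runs of at least 20 characters from the URL-safe base64 alphabet (the IDs in this thread are 22 characters long); the function name `printable_strings` and the file name in the usage part are made up for this example.

```python
import re


def printable_strings(data: bytes, min_len: int = 20):
    """Return runs of URL-safe base64 characters at least min_len long."""
    # Assumption: a node ID looks like 8ZqcR-3SS8aldEtzbbGbaQ (letters, digits, '-' and '_').
    pattern = re.compile(rb"[A-Za-z0-9_-]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    # Hypothetical file name: substitute the actual node-NNN.st present in _state.
    path = "./data/nodes/0/_state/node-0.st"
    try:
        with open(path, "rb") as handle:
            print(printable_strings(handle.read()))
    except FileNotFoundError:
        print("no such file:", path)
```

Running this after each shutdown and comparing the result with the node ID in the next startup log would show whether the file is being replaced or removed between runs.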

system · August 7, 2019, 3:38pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.