I've been researching this use case for a couple of days and it looks like there is about a thread per year on it. It also looks like the answers have changed over time. I believe I have a valid use case for this, but I'm struggling with implementation and I'm hoping you all can help me.
I have several machines with 24 cores / 192 GB RAM / a 10 GbE NIC. I believe this gives me capacity for up to 3 nodes per bare-metal server.
To accomplish that, I've seen options to containerize, use a hypervisor, run multiple sessions (screen), use -Des.config (which seems deprecated), and create individual services with unique yml files. I think my preferred option would be the last one: creating individual services. I just don't know how to go about it.
Is anyone aware of any kind of guide that's been published for this? It sounds like a lot of people have done this and hopefully, for my sake, I'm not the only one out here struggling with this. Any help is appreciated!
I currently do this...
You need to specify a separate service file for each instance.
You need a separate config for each instance... Mine are /etc/elasticsearch/{instance_name}/* (a rough example follows this list)
You will want separate log files for each... Mine are /var/log/elasticsearch/{instance_name}/elasticsearch.log
You will need to track pids for each instance separately
You will need to specify different lock files for each instance
You will need to have separate data paths for each instance. (separate disk is better)
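To make that concrete, here is a rough sketch of what one instance's config ends up looking like - the names and paths are illustrative, not my exact files:

    # /etc/elasticsearch/node1/elasticsearch.yml - illustrative sketch only
    cluster.name: my-cluster
    node.name: node1
    path.data: /data/node1                    # separate data path (ideally a separate disk)
    path.logs: /var/log/elasticsearch/node1   # separate log location
    http.port: 9200                           # each instance on the host gets its own ports
    transport.port: 9300
    # The matching service file points ES_PATH_CONF at /etc/elasticsearch/node1
    # and writes its own PID file, e.g. /var/run/elasticsearch/node1.pid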
There are a lot of issues I have had to solve in order to accomplish this, but overall it has been worth it in terms of cost per instance.
Appreciate both of those answers. So, realistically, I'd be looking to make the bare-metal servers dedicated data nodes. Then, I'd be spinning up VMs to serve as dedicated masters, ingest nodes, coordinating nodes, Kibana, etc.
@warkolm Would you have concerns about the container approach given the different types of node deployments? Would Elasticsearch not care because they're all just nodes? Could it make things more difficult to manage?
@lzukel It sounds like you went through some pain to get your solution to work. Would you be comfortable sharing some of the pitfalls?
Frankly I'd use Docker and not bother with any other route - it'll be 10x simpler & easier. Use Compose to manage the nodes so you can bring them up/down as you wish, limit CPU, etc. RAM is a bit harder, but you have heap control in ES so it's not a big deal. Even networking is simple & taken care of.
You can try both, of course - last time I set up a half-dozen nodes via Docker it took a few minutes total (mostly spent creating the single Compose file). I can't imagine managing the multi-instance method, though of course it's possible.
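As a rough sketch of what I mean (the image version, names, heap sizes, and CPU limits here are just placeholders), a Compose file for a couple of nodes might look something like this:

    # Rough sketch only - image version, names, heap sizes, and CPU limits are placeholders
    version: "2.4"
    services:
      es01:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.1
        environment:
          - node.name=es01
          - cluster.name=es-docker
          - discovery.seed_hosts=es02
          - cluster.initial_master_nodes=es01,es02
          - "ES_JAVA_OPTS=-Xms4g -Xmx4g"   # control RAM via the heap rather than a hard limit
        cpus: 4                            # cap CPU per node
        volumes:
          - esdata01:/usr/share/elasticsearch/data
        ports:
          - 9200:9200
      es02:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.1
        environment:
          - node.name=es02
          - cluster.name=es-docker
          - discovery.seed_hosts=es01
          - cluster.initial_master_nodes=es01,es02
          - "ES_JAVA_OPTS=-Xms4g -Xmx4g"
        cpus: 4
        volumes:
          - esdata02:/usr/share/elasticsearch/data
    volumes:
      esdata01:
      esdata02:

Individual nodes can then be brought up or down with docker-compose up -d <service> / docker-compose stop <service>.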
Appreciate all of the replies. I've decided to go down the Docker path first. To prove it out, I'm testing it on a 4-node test cluster (on 4 VM servers) with a dedicated master (it will be more robust after the POC). I'm trying to spin up a single data node in Docker and join it to the existing cluster, but I'm not having any success. With that said, I can spin up a multi-node Docker-exclusive cluster without any problem. Has anyone tried mixing and matching and had success?
It depends on the exact issue, but often with Docker the internal container IP and the external IP (or port) are not the same - so the outside nodes can't get the right IP/port list. All nodes in a cluster must be able to reach & talk to each other on port 9300.
Usually you need to set network.publish_host (and maybe the publish port), BUT then the containers must be able to see this address, too.
If you have an existing non-Docker cluster and a Docker node you want to join to it, then your Docker node must expose port 9300 on an IP the others can see, and network.publish_host on the container node must be set to that IP.
The overall reason is that when a node joins, it reports its IP/port to the master, and other nodes then use THAT IP/port to talk to the new node. But in Docker, the node's container IP is usually not an IP the other VMs can reach. Exactly how depends on the Docker networking mode you are using, but often you'll see a VM with an IP like 10.1.2.3 while the containers sit on a 192.x network that exists only on that VM. They'll report the 192.x address to the master, which is not routable from the other machines, so set network.publish_host to the 10.x address and it all works.
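For that single-Docker-node case, the relevant Compose fragment would look roughly like this (10.1.2.3 and the seed address are placeholders for your VM's LAN IP and your existing master):

    # Rough fragment only - addresses are placeholders
    services:
      es-docker-node:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.1
        environment:
          - network.publish_host=10.1.2.3            # the VM IP the other nodes can reach, not the container IP
          - discovery.seed_hosts=<existing-master-ip>:9300
        ports:
          - 9300:9300                                # expose the transport port to the LAN
          - 9200:9200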
Thanks Steve. It must be an issue with my networking setup. Even when I specify network.publish_host, it's still publishing to a different address than the one I specify. To be clear, I'm following this guide for a multi-node docker cluster:
Two key parts of that are where it sets the network for each service to "elastic" and where it defines that elastic network as bridged. Am I right in assuming my issue is with that bridged network?
You cannot fully use that procedure, as the nodes will all talk on the Docker internal network, which is unreachable from outside (except port 9200 for REST calls) - notice how NONE of them are exposing 9300 to the LAN, which the other, non-Docker nodes need.
Key things are:
All nodes must be able to talk to each other; you can test by telnetting to each node's IP and port 9300 (or whatever it is). Note that in Docker they may end up on various ports, since in bridge mode they can't all map to 9300 on the host, so the publish port has to be set as well so they give the master the right port. This makes things messy.
The cluster needs to know these 'public', reachable IPs and ports; you can see this in the /_cat/nodes list (might have to add a column; I forget).
The basic way is straightforward, in theory (warning: I have not tested this in a while, and it depends a bit on Docker), but there are lots of small parts to get right (I need to write a blog on this). The pieces are pulled together in the sketch after this list:
For Docker nodes, change the Compose file to map each container's transport port 9300 to a distinct port on the Docker host, like 9301, 9302, 9303 for three nodes. Do this for each container like:
ports:
- 9301:9300
For all nodes, set network.host to 0.0.0.0 so they listen on all interfaces (in case there are multiple).
For all Docker nodes, set network.publish_host to the Docker host VM's IP, as this is how both Docker and non-Docker nodes will talk to the Docker nodes. This will be the same for all nodes.
For all Docker nodes, set transport.publish_port to the port exposed on the Docker host, so if you expose 9301, 9302, 9303 for three nodes, you set the matching value for each container. This will be different for each Docker node, as they share the same VM IP and so cannot share the same port.
Bring up this cluster with Docker nodes only, and see if it works. The key thing is that each node uses the VM IP/mapped port (9301, 9302, etc.) to talk to the other nodes.
If that works, you can add your non-Docker nodes to this - use their normal settings, and the discovery IP/port will be one or more of the published IPs and ports above (you must include the port, or else your non-Docker node will try to connect on port 9300). So, something like: discovery.seed_hosts=[10.1.2.3:9301]
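Pulled together, a rough (untested) Compose sketch for two such Docker nodes on a VM at 10.1.2.3 might look like this - the IP, ports, and names are placeholders, and additional nodes just repeat the pattern with 9303 and so on:

    # Rough, untested sketch - 10.1.2.3 is a placeholder for the Docker host VM's LAN IP
    services:
      es01:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.1
        environment:
          - node.name=es01
          - cluster.name=es-docker
          - network.host=0.0.0.0                 # listen on all interfaces inside the container
          - network.publish_host=10.1.2.3        # the VM IP every other node can reach
          - transport.publish_port=9301          # must match the host-side port mapping below
          - discovery.seed_hosts=10.1.2.3:9301,10.1.2.3:9302
          - cluster.initial_master_nodes=es01,es02
        ports:
          - 9301:9300
          - 9201:9200
      es02:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.1
        environment:
          - node.name=es02
          - cluster.name=es-docker
          - network.host=0.0.0.0
          - network.publish_host=10.1.2.3
          - transport.publish_port=9302          # different per node, since they share the VM IP
          - discovery.seed_hosts=10.1.2.3:9301,10.1.2.3:9302
          - cluster.initial_master_nodes=es01,es02
        ports:
          - 9302:9300
          - 9202:9200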
Let me know if that makes sense; sorry if it's a bit jumbled - make sure you understand what these settings do and let us know how it goes. I promise a tested blog on this soon, i.e. this month.
Bridge networks apply to containers running on the same Docker daemon host. For communication among containers running on different Docker daemon hosts, you can either manage routing at the OS level, or you can use an overlay network.
I recommend instead using a Docker networking mode in which the nodes can see each other properly. Usually I'd say to start with host networking because it's the simplest. You can upgrade to a more complex network later, but IME it's tricky to set up multi-host Docker networking and an Elasticsearch cluster at the same time as if it doesn't work it's hard to tell where the problem lies.
The main drawback of host networking is that you must manage clashing ports yourself if you have multiple nodes on the same host. By default Elasticsearch automatically does approximately the right thing here, scanning for a free port, but you'll save some future hassle by nailing down at least the transport port of the master-eligible nodes.
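For illustration only (the ports, names, and the master's address are placeholders), a single node on host networking with its ports nailed down might look something like this in a Compose file:

    # Sketch only - ports, names, and the master's address are placeholders
    services:
      es-data-1:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.1
        network_mode: host                    # the container shares the VM's network stack
        environment:
          - node.name=es-data-1
          - cluster.name=my-cluster
          - transport.port=9301               # pinned, rather than letting ES scan for a free port
          - http.port=9201
          - discovery.seed_hosts=<master-ip>:9300
        volumes:
          - /var/lib/elasticsearch/es-data-1:/usr/share/elasticsearch/data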
Strong disagree. We frequently field questions on these forums that boil down to "docker bridge networking doesn't work across hosts".
Thanks, and agreed it's a hack & not that simple for folks; I was being a bit facetious (at least it's not k8s), but it was also the simplest approach I could think of for this scenario - host networking is Linux-only, & my worry was that most folks will just use the networking defaults and get confused without Compose ports, etc. Host networking still requires setting the http/transport ports, as you note, but it's still simpler than publish IPs & ports.
But agreed, it's probably better to use it, since then the ports are obvious and reported correctly, with nothing Docker-specific, so it's easy for folks coming from VMs.
Sorry for over-complicating it. Seems a blog with clear examples is in order.
Hi everyone, I'm finally getting a chance to reply after testing this out in my spare time, but I'm still missing something. I have been able to get it to run on the host network, but it's still unable to discover the rest of the cluster. Here is the error I get when I try to start it:
fox-elk02-2-cont | {"type": "server", "timestamp": "2020-09-18T18:44:37,300Z", "level": "WARN", "component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "clusterfox", "node.name": "fox-elk02-2", "message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered [{fox-elk02-2}{NgQBUIwFTx65kUVCrYrxyg}{q-N1ja0CRC-2pJKGQG0qjA}{192.168.1.91}{192.168.1.91:9301}{dilmrt}{ml.machine_memory=6087008256, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}]; discovery will continue using [192.168.1.90:9300, 192.168.1.91:9300, 192.168.1.92:9300, 192.168.1.93:9300] from hosts providers and [{fox-elk02-2}{NgQBUIwFTx65kUVCrYrxyg}{q-N1ja0CRC-2pJKGQG0qjA}{192.168.1.91}{192.168.1.91:9301}{dilmrt}{ml.machine_memory=6087008256, xpack.installed=true, transform.node=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0" }
Also, I can see it's listening on 9201 even though I don't have 9201 specified as the http port in my yml. Is that weird or expected?
It'll do that by default if another process is already listening on port 9200. When there are multiple nodes on the same interface I recommend explicitly setting transport.port and http.port on every node, so that it fails at startup rather than scanning for the next free port.
Also, don't set network.publish_host and network.publish_port; they are unnecessary here and will just add to the confusion.
Also also, if your master nodes are running on a transport port other than 9300, then you will need to specify that port in the discovery config: discovery.seed_hosts=...,192.168.1.91:9301,...
The main problem, however, is the one described in the log message: [cluster.initial_master_nodes] is empty on this node.
This setting is required the first time the cluster starts. See these docs for more info.
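As a rough illustration based on the addresses in your log (treat the exact ports and names as examples, not prescriptions), the Docker node's settings would end up looking something like this:

    # Illustrative only - adjust the ports and addresses to your actual layout
    environment:
      - node.name=fox-elk02-2
      - cluster.name=clusterfox
      - transport.port=9301        # explicit, so startup fails instead of silently picking another port
      - http.port=9201             # likewise for HTTP
      - discovery.seed_hosts=192.168.1.90:9300,192.168.1.92:9300,192.168.1.93:9300,192.168.1.91:9301
      # cluster.initial_master_nodes is only needed when bootstrapping a brand-new cluster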
Thanks David. This has helped. I initially had initial_master_nodes specified (but commented out), so I uncommented it and brought the container back up. Same result. However, this time I looked at the logs on the master node and see this:
[2020-09-19T09:26:03,302][WARN ][o.e.x.c.s.t.n.SecurityNetty4Transport] [fox-elk01] received plaintext traffic on an encrypted channel, closing connection Netty4TcpChannel{localAddress=/192.168.1.90:9300, remoteAddress=/192.168.1.91:55344}
That leads me to believe the xpack params weren't being taken into account. I'm starting to modify the params specific to certificates, but I'm getting access denied on the path. I'm going through an exercise to better understand volume config and I'll reply back with what I find. Hopefully, it's a post with good news. Appreciate everyone's help!
Alright, gang. I was able to get the bare metal/Docker combo node option working, but only if I set it up without TLS. As soon as I try to mount a local driver to the container and reference the certs, the Compose execution fails with an "access denied" on the cert folder specified. All of the guides that talk about enabling TLS on an Elastic Docker cluster create the certs in a new volume. My question is, how do I get my new Docker node to utilize certs that I've already created for my bare metal nodes? Here is where my yaml sits today. Appreciate all the help!
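For reference, the shape of what I'm trying to do is roughly the following - every path and name here is a placeholder, not my actual file:

    # Placeholder sketch of mounting existing certs into the container
    services:
      fox-elk02-2:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.1
        network_mode: host
        environment:
          - xpack.security.enabled=true
          - xpack.security.transport.ssl.enabled=true
          - xpack.security.transport.ssl.verification_mode=certificate
          - xpack.security.transport.ssl.keystore.path=/usr/share/elasticsearch/config/certs/elastic-certificates.p12
          - xpack.security.transport.ssl.truststore.path=/usr/share/elasticsearch/config/certs/elastic-certificates.p12
        volumes:
          - /etc/elasticsearch/certs:/usr/share/elasticsearch/config/certs:ro   # existing certs from the bare metal nodes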