Hi, I'm thinking about running two containers:
elasticsearch master/data/ingest node -> container A
elasticsearch machine learning -> container B
but they must run on the same server.
I think there might be two approaches:
1. putting the elasticsearch machine learning service on a different port, e.g. 9201
2. merging everything into one container and running it as a master/data/ingest/machine learning node.
Regarding point 1, I have no idea how to point the two instances at different binaries; the default location is:
/usr/share/elasticsearch.
May I use:
/usr/share/elasticsearch1/
/usr/share/elasticsearch2/ ?
I think this question is docker specific, and there seems to be a misunderstanding of how containers are used: point 1 sounds correct, but the question about the binaries does not. Containers are started from a docker image, and the docker image should not be modified.
You should simply use two docker containers, and each can be mapped to a different local port on the docker host, if you even need to map each node. There is no real reason to map the ML node: it does not normally need to be accessible from the docker host.
Modifying the docker image to run two instances of elasticsearch would be unsupported, and it does not make much sense from a docker standpoint: a container is normally expected to run one process (with PID 1), and if that process ends, the container terminates.
For volumes (whether you use named or bind volumes), each docker container gets its own volume mapped to the container directory /usr/share/elasticsearch/data, as in our documentation.
And of course this is not elasticsearch specific: any container should also be limited in vCPUs and memory so there is no over-allocation of resources when you run multiple containers on a docker host (to avoid noisy-neighbour issues).
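As a rough sketch only, the points above (two containers, one data volume each, resource limits, only the master/data/ingest node published on the host) could look like the compose file below. The image tag, volume names, heap sizes, and limits are placeholders, and the role settings assume a 7.x-style configuration; check the docs for your version.

```yaml
version: "3.8"
services:
  es-node-a:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
    environment:
      - node.name=es-node-a
      - cluster.name=demo-cluster
      - discovery.seed_hosts=es-node-ml
      - cluster.initial_master_nodes=es-node-a
      - node.ml=false                      # ML runs on the dedicated ML node
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    ports:
      - "9200:9200"                        # only this node is exposed on the host
    volumes:
      - esdata-a:/usr/share/elasticsearch/data
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 2g
  es-node-ml:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
    environment:
      - node.name=es-node-ml
      - cluster.name=demo-cluster
      - discovery.seed_hosts=es-node-a
      - cluster.initial_master_nodes=es-node-a
      - node.master=false
      - node.data=false
      - node.ingest=false
      - node.ml=true
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    # no ports mapping: the ML node does not need to be reachable from the host
    volumes:
      - esdata-ml:/usr/share/elasticsearch/data
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 2g
volumes:
  esdata-a:
  esdata-ml:
```

Both nodes talk to each other over the default compose network, so the ML node never needs a host port at all.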
@Julien
So the best and simplest solution would be to make this elastic node a master/ingest/data/machine learning node and use port 9200 for all of these roles. The only thing needed is to put lines in the elasticsearch.yml as follows:
I am not sure which version that question is for, and it is best to check the doc for the version you use, but generally the node.ml and xpack.ml.enabled settings both default to true, just like the other settings you mentioned (so you could omit all of these settings). If you want to run all roles on one node, then yes, those settings in elasticsearch.yml, passed to the container via environment variables, are correct. (If you want to separate the roles to have one ML node, you should disable ml on the master-data node and disable all the other roles on the ML node.)
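Assuming 7.x-style settings (the exact names vary by version, so check the doc for yours), the role separation described above might look like this in each node's elasticsearch.yml:

```yaml
# elasticsearch.yml for the master/data/ingest node
node.master: true   # default; shown for clarity
node.data: true     # default; shown for clarity
node.ingest: true   # default; shown for clarity
node.ml: false      # ML jobs go to the dedicated ML node

# elasticsearch.yml for the dedicated ML node
# node.master: false
# node.data: false
# node.ingest: false
# node.ml: true            # default; shown for clarity
# xpack.ml.enabled: true   # default; shown for clarity
```

Each of these can also be passed to the container as an environment variable (e.g. `node.ml=false`) instead of editing the file.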
You can check with GET _cat/nodes?v to see which roles each node has (doc for latest version).
The main reason is scalability and high availability. ML and data nodes both use a lot of CPU and memory (ML runs outside the JVM heap), so running everything on the same node can lead to performance issues (for example, an ML job making the data node slower for ingestion or search when ML uses a lot of CPU).