Question on running Elasticsearch in Docker


(Tay Eng Soon) #1

Currently I have a cluster with 5 nodes. The master nodes have the same hardware specs as the data nodes: a 16-core CPU and 32GB of memory.

Since the master node doesn't use that much CPU and RAM, my manager argues that resources are being wasted. Instead of keeping it as a dedicated master node, I have been asked to make it a hybrid node by setting up two Docker containers on it: one dedicated master container and one dedicated data container.

Questions:

  1. Is this recommended? The guide recommends separating the roles and using dedicated nodes, right?

  2. Normally you can find the logs in /var/log/elasticsearch, but I can't find them when I run Elasticsearch in Docker. Where can I find them?

  3. If you install Elasticsearch as a service, you can update it with apt, but what about Docker? Can it be updated as well?

  4. All the nodes are actually VMs, so would installing Docker on them be counterintuitive?

  5. Is it better to combine a few data node containers on one single server, or to run each as its own dedicated node? If dedicated, it is better to install ES as a service, right?

Thank you.


#2

Is this recommended? The guide recommends separating the roles and using dedicated nodes, right?

Master nodes don't need that much CPU and memory. We have master nodes with 2 vCPUs and 8GB RAM (heap set to 2GB). It is good to have different machines for different types of ES nodes; that helps the scalability and high availability of your cluster. BTW, how many master nodes do you have? If it is just one and that node goes down, your cluster will have no master. The recommendation is to have 3 master-eligible nodes and then set the following property, which also takes care of the split-brain scenario (see https://www.elastic.co/guide/en/elasticsearch/guide/2.x/important-configuration-changes.html#_minimum_master_nodes):

discovery.zen.minimum_master_nodes: 2
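
The value 2 comes from the quorum rule in the linked guide: a majority of the master-eligible nodes, i.e. (number of master-eligible nodes / 2) + 1 with integer division. A minimal sketch of that arithmetic; the function name `minimum_master_nodes` is made up for illustration and is not part of any Elasticsearch API:

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


# Print the quorum for a few cluster sizes.
for n in (1, 2, 3, 5):
    print(n, "master-eligible ->", minimum_master_nodes(n))
```

With 3 master-eligible nodes this gives 2, matching the setting above. Note that with only 2 master-eligible nodes the quorum is also 2, so losing either one leaves the cluster without a master; that is why 3 is the usual minimum. This setting belongs to the older Zen discovery described in the 2.x guide, so check the documentation for the version you actually run.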

Normally you can find the logs in /var/log/elasticsearch, but I can't find them when I run Elasticsearch in Docker. Where can I find them?

When you start a dockerized Elasticsearch, the logs are written inside the Docker container. They are still at /var/log/elasticsearch. To see them, either log in to the container or expose that internal folder to your host when starting the container. Read more at https://docs.docker.com/engine/admin/volumes/volumes/

For example:

docker run -v /path/on/my/host:/var/log/elasticsearch ${any other necessary docker params like ports} ${es_docker_image}

If you install Elasticsearch as a service, you can update it with apt, but what about Docker? Can it be updated as well?

Read more about running Elasticsearch in Docker at https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html and the general upgrade guidelines at https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-upgrade.html
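
With Docker you normally do not upgrade inside a running container; you pull a newer image and recreate the container on top of the same data volume. The sketch below only prints the commands for one node so the order is visible. The container name, volume name, and version tag are hypothetical, the data path assumes the official image layout, and the `docker run` line leaves out ports and other settings you would need. For a real cluster, follow the rolling-upgrade steps in the upgrade guide linked above as well.

```python
from typing import List

# Repository of the official Elasticsearch image.
IMAGE_REPO = "docker.elastic.co/elasticsearch/elasticsearch"
# Data directory inside the container (assumes the official image layout).
DATA_DIR = "/usr/share/elasticsearch/data"


def upgrade_commands(container: str, data_volume: str, new_tag: str) -> List[List[str]]:
    """Return the docker commands that replace one container with a newer image."""
    image = f"{IMAGE_REPO}:{new_tag}"
    return [
        ["docker", "pull", image],       # fetch the new image first
        ["docker", "stop", container],   # stop the old node
        ["docker", "rm", container],     # remove the old container; the volume is kept
        ["docker", "run", "-d", "--name", container,
         "-v", f"{data_volume}:{DATA_DIR}", image],  # start the new one on the same data
    ]


# Hypothetical names; nothing is executed, the commands are only printed.
for cmd in upgrade_commands("es-data-1", "esdata1", "<new-version>"):
    print(" ".join(cmd))
```

Because the data lives in the volume rather than in the container, removing the old container does not delete the indices.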

All the nodes are actually VMs, so would installing Docker on them be counterintuitive?

Generally, you should have one type of Elasticsearch node per VM. For example, you can have a dedicated VM for running a data node container, a client node container, or a master node container. The size of the VM can vary. Master node VMs can be 2 vCPU, 4GB RAM machines; data and client nodes should be larger (depending on your use case). For example, we use 8 vCPU, 32GB RAM machines.

Is it better to combine a few data node containers on one single server, or to run each as its own dedicated node?

Hosting multiple VMs on the same host machine is risky, IMO. For example, say you have 3 data nodes holding 1 primary shard and 2 replica shards. Each runs in its own VM, but all the VMs are hosted on the same blade. If that blade crashes, you lose all the data. Running multiple ES instances inside the same Docker container carries similar risks, and possibly a performance penalty too. It is therefore recommended to run each data node on a separate VM, with each of those VMs on a different blade. Sometimes we even go as far as ensuring that each blade is on an entirely different rack.
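
To make the blade example concrete, here is a small sketch that flags shards whose copies (primary plus replicas) all sit on one blade. The node names, blade names, and the function `shards_on_single_blade` are invented for illustration; it is not an Elasticsearch API, just the reasoning above written out.

```python
from typing import Dict, List


def shards_on_single_blade(
    shard_copies: Dict[str, List[str]],
    node_to_blade: Dict[str, str],
) -> Dict[str, str]:
    """Return {shard: blade} for every shard whose copies all share one blade."""
    at_risk = {}
    for shard, nodes in shard_copies.items():
        blades = {node_to_blade[node] for node in nodes}
        if len(blades) == 1:
            # Losing this single blade loses every copy of the shard.
            at_risk[shard] = next(iter(blades))
    return at_risk


# One primary and two replicas spread over three data nodes.
copies = {"shard-0": ["data-1", "data-2", "data-3"]}

same_blade = {"data-1": "blade-a", "data-2": "blade-a", "data-3": "blade-a"}
spread_out = {"data-1": "blade-a", "data-2": "blade-b", "data-3": "blade-c"}

print(shards_on_single_blade(copies, same_blade))  # {'shard-0': 'blade-a'}
print(shards_on_single_blade(copies, spread_out))  # {}
```

Elasticsearch also has shard allocation awareness settings that can keep copies of a shard in different zones or racks; check the documentation for your version if you want the cluster to enforce this itself.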

If dedicated, it is better to install ES as a service, right?

In general, Docker should be run as a service. This ensures that if the server has to restart, after planned maintenance or otherwise, the Docker daemon is started too, and the underlying containers with it. Otherwise you will have to manage the startup manually...

Hope the above helps.


(Tay Eng Soon) #3

Thanks for the detailed and informative reply!

For now I will mark your reply as the answer!

