Cluster hardware setup

Currently in a proyect we have 4 nodes, eachone with this specs:

  • 10TB of HD
  • 1TB of RAM
  • 20 Cores of CPU

I would like to confirm that:

Question 1: It is a good idea to buy 6 Elasticsearch Platinum licenses for this nodes? Note: As i understand 1 license per node, will be let 2 licenses without use.

Question 2: It is a good idea the following configuration?

node1, node2, and node3:
master: true
data: true
ingest: true
ml: true

node4:
master: false
data: true
ingest: true
ml: true

heap size max (all nodes): 30 GB

node 1 and 3: rack_1 , hot
node 2 and 4: rack_2 , warm

Question 3: I must avoid running multiple instances of Elasticsearch in same physical node, right?

Q1: you don't have the choice. Each node of the cluster will count in the amount of nodes you pay for.

Q2 and Q3: as you have so much RAM on each physical machine, I'd run:

  • 3 dedicated master nodes
  • 8 data only nodes (2 per machine)

What kind of use case are you going to run?

But as your intention is to buy a license, I'd definitely ask an elastic solution architect to help you designing it right according to your use case. May ECE would make sense here..

Thanks! I'm really curious about this:

Q1: Should I run 3 instances of elasticsearch as dedicated master node on each physcal machine?

Q2: Should I run 2 instances of elasticsearch as dedicated data node on each physcal machine?

I took the training on Elasticsearch and got the certification, and as I understand It is not recommended to run multiple instances of Elastic in the same physical node.

Q3: Would be a good idea to virtualize each physical node? The target will be use as much RAM as possible.

What kind of use case are you going to run?

Indexing and querying government documents with hyperconvergence

The client bought 6 Elasticsearch Platinum Onpremise licenses without asking me :sweat_smile:

Should I run 3 instances of elasticsearch as dedicated master node on each physcal machine?

One instance per node.

Should I run 2 instances of elasticsearch as dedicated data node on each physcal machine?

Yes.

That's true as normally you have machines with at most 64gb of RAM. But here you have so many RAM available that I'd probably use it.
If you do run multiple instances on the same machine, you need to be careful with shard allocation and make sure that primaries and replicas are not allocated on the same physical machine. That's something covered by the training as well (and that's very good that you attended the trainings!)

Would be a good idea to virtualize each physical node? The target will be use as much RAM as possible.

I honestly don't know what would be the advantage.

So mostly a search use case than index use case. I mean it's not about IOT and logs...

That's fine then. I guess they have been in touch with solution architects.

As you have 6 licences, I'd do then:

  • 1 instances of elasticsearch as dedicated master node on each physical machine
  • 1 instances of elasticsearch as dedicated data node on each physical machine

Which makes 6 nodes in total.

HTH

1 Like

Thanks.

Sure, I'll use shard allocation awareness in the cluster settings. Also use the rack attribute.

I'm thinking about use docker to run the cluster. It's a good idea? Currently I just running using linux tar.gz files

True only search and index use case.

Q1: It's a good idea I run a logstash instance on each physical node? To make the ingest and processing the data.

As I understand:

( 1 master + 1 data instance ) * 4 physical nodes = 8 instances

I'm correct?

Thanks for your help! :pray: