Better to use 3 master+data nodes or 3 master + 2 data nodes?


We're running a search engine with two ES clusters. The first, the "content store", holds the crawled data, logs, stats, and metrics, so it has a heavy write, read, and aggregation load. The second, the "public search" cluster, holds the per-language public search indices and has a much higher read load than write load.

We're currently using a single-node AWS-hosted cluster for the content store and Elastic Cloud with 3 nodes for the public search cluster. The problem is that our disk space needs keep growing, yet we can't afford to keep upgrading to the larger Elastic Cloud memory + disk sizes, so we're looking to move to our own clusters hosted on the AWS ES service.

Now I'm wondering if it'll be better to run 3 master+data m4.large nodes with 512gb disk each now, and switch to dedicated masters later or just keep adding more master+data nodes? Or do we start with 3 dedicated masters on t2.micros and just have 2 m4.large data nodes till we can afford to add more data nodes?

For the public search cluster we need high redundancy, and with zone awareness we'd need 2 or 4 data nodes to start, so it's probably best to start with dedicated masters there. For the content store we need a lot more disk space, and it can go offline for hours without problems as we're using queues, so it's probably more cost-effective to use 3 master+data nodes there.

Thanks for any advice!


Hi vaughnd,

What is the average index size in the two clusters?
How many shards and replicas do you have?

Another point:

One of the principles of Elasticsearch is elasticity and scalability. However, depending on the amount of data, it is better to have dedicated nodes for each function.
The master nodes are used only for cluster management, i.e. they do not use the same amount of memory as a data node. Without knowing the size of your indices and the number of shards, the initial recommendation is to give each node a different role.

I understand that you may find a cheaper mem-to-disk ratio by building your own cluster in EC2 with EBS, but keep in mind that you will need enough memory for the filesystem cache if you really want high read throughput, as EBS can be a performance killer (you could try improving it with provisioned IOPS, but depending on the disk size and the load required, its cost will become prohibitively high quite fast).

You should start with m4.2xlarge, as it is the minimum recommended for any production-level cluster. Depending on the data size you could maybe try m4.xlarge, but avoid m4.large since it only has 2 cores (with such a low core count, performance will be really bad). Also, m4.large as master+data will be really bad for cluster stability, because as soon as heavy queries start hitting the cluster there is a high chance that the whole cluster will become unavailable.

Lastly, avoid t2 instances, especially for master nodes, as t2's CPU credit system is generally not suitable for Elasticsearch.

These are all general recommendations; what you really should do is test and benchmark. One great tool that will help you with that is Rally.

Content store:
Our largest index is 577gb 21mil docs: content_store
The logs, metrics, etc. indices are all averaging 30gb.

Search indices:
english search index: 170gb 11mil docs
dutch search index: 85gb 4.8mil docs

Both use the default of 5 shards. Only the search indices cluster has a replica (a single one).

Thanks for the advice. My CEO won't like the costs involved :smiley:

So we're probably looking at 3 m4.large master nodes and 2 m4.xlarge data nodes for each of the 3 clusters. Then we might as well use the AWS ES service instead of our own OpsWorks setup.
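
For reference, here is a rough sketch of the on-disk footprint implied by the index sizes above, split across 2 data nodes. The function names are made up for illustration, and the count of ~30 GB log/metric indices (3) is an assumption, since the exact number wasn't given. It ignores merge overhead and free-space headroom, so treat the results as a lower bound.

```python
def footprint_gb(primary_sizes_gb, replicas):
    """Total disk used by primaries plus their replica copies."""
    return sum(primary_sizes_gb) * (1 + replicas)


def per_node_gb(total_gb, data_nodes):
    """Even split of the footprint across data nodes."""
    return total_gb / data_nodes


# Public search cluster: english + dutch indices, one replica each.
search_total = footprint_gb([170, 85], replicas=1)

# Content store: the 577 GB index plus an ASSUMED three ~30 GB indices,
# with no replicas (only the search cluster has a replica).
small_index_count = 3  # assumption, not from the thread
content_total = footprint_gb([577] + [30] * small_index_count, replicas=0)

for name, total in [("public search", search_total), ("content store", content_total)]:
    print(f"{name}: {total} GB total, {per_node_gb(total, 2)} GB per node on 2 data nodes")
```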

Hi @vaughnd,
Nice. One recommendation is to keep shards at up to 50 GB in size, for ease of reallocation. Your largest index, at 577 GB with 5 shards, has shards of about 115 GB each. If shard reallocation time is not a problem for you, you can keep it at 5 shards; otherwise, increase the number of shards.
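
As a quick sketch of that arithmetic (the helper name `shards_needed` is just for illustration, and the 50 GB cap is the rule of thumb above, not a hard limit), this is how many primary shards each index would need to stay under the cap:

```python
import math


def shards_needed(index_size_gb, max_shard_gb=50):
    """Smallest primary shard count that keeps each shard at or below max_shard_gb."""
    return max(1, math.ceil(index_size_gb / max_shard_gb))


# Index sizes as reported in this thread.
indices = {"content_store": 577, "english search": 170, "dutch search": 85}

for name, size_gb in indices.items():
    n = shards_needed(size_gb)
    print(f"{name}: {n} shards of about {size_gb / n:.1f} GB each")
```

With these numbers the 577 GB index would need 12 shards rather than 5. The number of primary shards is fixed when an index is created, so changing it means creating a new index with the new setting and reindexing into it.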

It would be nice to distribute the masters across 3 regions. As a master does not use as much memory as a data node, you can use a smaller machine, but avoid t2 as advised by @thiago.

Also remember that it is important to attach dedicated disks for the storage of ES shards, so that any EBS maintenance that is needed becomes easier.

How long will your indexes need to be kept in ES? If they are never deleted, they will tend to keep growing. Ideally, you would configure snapshots of the indexes, send them to S3, and delete the indexes once that process has finished.

Defining the retention time and how the indexes are split by time (daily / weekly / monthly) is important here. Once a snapshot has finished, it can be restored whenever necessary, which saves disk space by leaving in ES only the indexes that are actually in use.
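
A minimal sketch of that snapshot-then-delete flow, written as the sequence of REST calls against the Elasticsearch snapshot API. It only builds the requests and does not send them. The repository, bucket, index, and snapshot names are placeholders, and the function names are made up for illustration. The S3 repository type needs S3 repository support in the cluster, and on the AWS ES service registering the repository needs extra settings (such as an IAM role) and signed requests, which are left out here.

```python
import json


def archive_plan(repo, bucket, index, snapshot):
    """REST calls (method, path, body), in order, to snapshot an index to S3 and then delete it."""
    return [
        # 1. Register an S3-backed snapshot repository (done once per repository).
        ("PUT", f"/_snapshot/{repo}", {"type": "s3", "settings": {"bucket": bucket}}),
        # 2. Snapshot only the given index and wait until the snapshot completes.
        ("PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
         {"indices": index, "include_global_state": False}),
        # 3. Delete the index, but only after step 2 has reported success.
        ("DELETE", f"/{index}", None),
    ]


def restore_call(repo, snapshot, index):
    """REST call to bring an archived index back when it is needed again."""
    return ("POST", f"/_snapshot/{repo}/{snapshot}/_restore", {"indices": index})


# Placeholder names, for illustration only.
for method, path, body in archive_plan("s3_archive", "my-es-snapshots", "logs-2017-01", "logs-2017-01-snap"):
    print(method, path, json.dumps(body) if body is not None else "")
```

Whatever sends these requests should check the response of the snapshot call before issuing the delete, so that an index is never removed without a completed snapshot behind it.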
