Optimal node roles for a 100-node cluster

We have installed a 100-node cluster that will serve as a search engine, similar to Google.
The servers are located in a data center.
RAID 0 was chosen to meet the high-speed requirement.
Query traffic will be heavier than indexing traffic.
Queries will include boolean, wildcard, fuzzy, transposition, and function score clauses.
The data in the index is updated periodically (about once every 1-2 weeks).
There will be 1 replica for each index.

Node hardware
128 GB RAM
64 CPU
8.8 TB SSD

Node roles
3 Masters
3 Coordinators + ingest
94 Data
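
As a rough illustration of the layout above, here is a minimal sketch of how each group might map to the `node.roles` setting in elasticsearch.yml. It assumes Elasticsearch 7.9 or later, where `node.roles` is available; the `LAYOUT` dictionary and `roles_line` helper are hypothetical names used only for this example.

```python
# Planned layout from the post: group name -> (node count, node.roles value).
# Assumption: Elasticsearch 7.9+ where the node.roles setting exists.
LAYOUT = {
    "master": (3, ["master"]),
    "coordinator+ingest": (3, ["ingest"]),
    "data": (94, ["data"]),
}


def roles_line(roles):
    """Render the node.roles line for elasticsearch.yml (illustrative helper)."""
    return "node.roles: [ " + ", ".join(roles) + " ]"


# Sanity check that the groups add up to the full cluster.
total_nodes = sum(count for count, _ in LAYOUT.values())

for name, (count, roles) in LAYOUT.items():
    print(f"{count:>3} x {name}: {roles_line(roles)}")
print("total nodes:", total_nodes)
```

A node with only the ingest role still coordinates requests, since every node can act as a coordinating node.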

As recommended by best practice, a separate cluster will be set up for stack monitoring, and the data will be sent to that cluster with Metricbeat.
Ingest pipelines (including for stack monitoring) are not actively used. We chose coordinator + ingest in case we need them in the future.

Question 1
Does a coordinator + ingest node perform as well as a dedicated coordinating node when the ingest role is not actively used?

Question 2
Would you recommend putting a load balancer in front of the 3 coordinator nodes for indexing or querying?

Question 3
Some indices are as large as 60 TB. Calculating according to best practice:
"Aim for shard sizes between 10GB and 50GB"
60 TB index => shard count must be between 1200 and 6000
Is it OK to use 1500 shards for a single index in a system with 100 nodes?
Note: Indexing is _id-based. The index is constantly being updated, and to avoid duplicate records the data could not be split across multiple indices.
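
The arithmetic behind these numbers can be checked with a short sketch. It assumes decimal units (1 TB = 1000 GB, which matches the 1200 and 6000 figures) and uses the 94 data nodes and 1 replica stated earlier; the variable names are illustrative.

```python
import math

INDEX_SIZE_GB = 60 * 1000            # 60 TB, assuming 1 TB = 1000 GB
MIN_SHARD_GB, MAX_SHARD_GB = 10, 50  # recommended shard size range
DATA_NODES = 94                      # from the planned node roles
REPLICAS = 1                         # one replica per index
PROPOSED_PRIMARIES = 1500            # shard count being considered

# Fewest shards (largest allowed shard) and most shards (smallest allowed shard).
min_shards = math.ceil(INDEX_SIZE_GB / MAX_SHARD_GB)
max_shards = INDEX_SIZE_GB // MIN_SHARD_GB

# Size of each primary shard at the proposed count.
shard_size_gb = INDEX_SIZE_GB / PROPOSED_PRIMARIES

# Primary plus replica copies, spread evenly over the data nodes.
total_copies = PROPOSED_PRIMARIES * (1 + REPLICAS)
copies_per_node = total_copies / DATA_NODES

print(f"shard count range: {min_shards} to {max_shards}")
print(f"shard size at {PROPOSED_PRIMARIES} primaries: {shard_size_gb:.0f} GB")
print(f"shard copies per data node: {copies_per_node:.1f}")
```

Under these assumptions, 1500 primaries gives 40 GB shards, inside the recommended range, and roughly 32 shard copies of this one index per data node.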

Question 4
We are considering using a (Turkish) dictionary stemmer for natural language processing, but we have performance concerns.
Do you have any suggestions?

Regards,
Musab

100 nodes is pretty large FWIW. We don't really see clusters this large these days, due to the use of things like cross-cluster search (CCS) and cross-cluster replication (CCR).

Question 1: Yes.

Question 2: If you can, yes.

Question 3: You should really test this with your data, your queries, and your SLAs.

Question 4: Same as the previous question; you need to test this yourself.
