Important theoretical questions about Elasticsearch architecture

Hello, we are developing a search engine with Elasticsearch and already have 3 nodes, each on its own dedicated server.
We have read most of the documentation, but have run into a couple of problems.

About our servers:
we have 3 servers, each with 64 GB of RAM, 16 CPU cores, and a 2000 GB NVMe drive.

We have two main indices that will (in the future) contain billions of documents (the service has not started yet).
So, on the three nodes, we have 3 primary shards and 2 replicas for each index (each node will contain one primary and one replica).
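
For context, a minimal sketch of how such an index could be created (the index name is a placeholder, and this assumes one replica copy per primary, matching the one-primary-plus-one-replica-per-node layout above):

```
PUT /my-index
{
  "settings": {
    "index.number_of_shards": 3,
    "index.number_of_replicas": 1
  }
}
```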

Design of the nodes: (diagram hosted on ImgBB)
So, my questions are:

As the documentation says, each index is made up of shards, which in turn contain segments (in our case, each index contains 3 primary and 2 replica shards).
But in the docs, you recommend using 20 GB - 40 GB of memory for each shard.

If each of our shards holds 30 GB of memory, then the 3 primaries will take 90 GB + 2 replicas (60 GB);
in total, for the two indices, we will use 300 GB of memory.

The first question is:
can we add a 4th primary shard to one of the existing nodes (for example, to node-1) and store its replica on another node?
And if we add one more shard after that, will it also go on the existing nodes? (If it's possible, I would like to have 3 primary and 2 replica shards on each node.)

Or, if I'm wrong, can you suggest what we should do and how we should allocate our shards between the three nodes so that we can easily expand in the future?

Second question:
we cannot change the number of primary shards after creating the index, but we can change the number of replicas. So, if we want to expand, should we only increase the number of replicas, or should we reindex?

I also have a few more questions, but they depend on the answers to the ones above.

thank you!

Welcome to our community! :smiley:

You cannot tell Elasticsearch to store specific primary or replica shards on a given node.

You can change the number of primary shards - Split index API | Elasticsearch Reference [7.10] | Elastic
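
For example, something along these lines would split a 3-shard index into 6 shards (the index names here are placeholders; the target shard count must be a multiple of the source's). Writes have to be blocked on the source index first:

```
PUT /my-index/_settings
{
  "settings": { "index.blocks.write": true }
}

POST /my-index/_split/my-index-split
{
  "settings": { "index.number_of_shards": 6 }
}
```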


Wow, very useful information, thank you @warkolm. In case we want to add one more shard, we can use _split.
But what about the number of shards that one node can hold? I've just tried to create 8 primaries and 3 replicas on 4 nodes and it succeeded. But if I add one more node, it will relocate shards; so, should I increase the number of shards in that case?

As I understand it, the _split method is an analog of _reindex, but it copies all the documents, so it will create a new index. But if we only increase the number of replicas, is it necessary to _split or _reindex?
By the way, the _split method blocks writes on the source index, so we cannot write data into that index (can we delete it?)

And is it better to allocate a larger number of shards with a small amount of data each (10 GB), or fewer shards with more data each (for example, 100-200 GB)? (PS: all of this will stay within the three nodes.)

thank you~~))

Only if you have 1 shard to start, yes.

Why? You could increase the replica count again, I guess.

No, see Split index API | Elasticsearch Reference [7.10] | Elastic.

You can just increase the replica count without splitting or reindexing.
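
For example (the index name is a placeholder), this can be applied to a live index at any time, and Elasticsearch will allocate or remove the extra shard copies on its own:

```
PUT /my-index/_settings
{
  "index": { "number_of_replicas": 2 }
}
```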

Yes but you can revert that once the split is done.
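
For example, resetting the setting to null should remove the write block once the split is done (the index name is a placeholder):

```
PUT /my-index/_settings
{
  "index.blocks.write": null
}
```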

I don't understand this one sorry.


The size of a shard on disk does not directly correspond to the amount of memory required, as not all data is held in memory. I have seen nodes with 64GB RAM and a 30GB heap hold many terabytes of data on disk. The less data you hold on disk, the more of it can be cached in the operating system page cache, which will improve query speed and reduce disk I/O. The optimal amount of data per node therefore depends on the use case and whether it is index- or query-heavy.
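
As a rough sketch, you can compare on-disk shard sizes with the cat shards API (the index name is a placeholder):

```
GET _cat/shards/my-index?v&h=index,shard,prirep,state,docs,store,node
```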

Elasticsearch will automatically distribute the existing shards across the available nodes, and there is no requirement that the number of primary or replica shards exactly corresponds to the number of data nodes.
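
You can verify how shards and disk usage end up spread across the nodes with, for example:

```
GET _cat/allocation?v
```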

This depends on the use case and why you are expanding. The number of replicas is typically increased in order to handle more concurrent queries. If, on the other hand, you are scaling out to hold more data or to index more, you would increase the number of indices or the number of primary shards per index.


@Christian_Dahlqvist
So, if each of our shards reaches 20 GB, should we add more shards to improve Elasticsearch's performance? (The docs recommend keeping 20 GB - 40 GB of data in each shard.)
I think adding a replica will not change the size of the primary shards. And if the data keeps growing, each shard could reach up to 100 GB; will that negatively affect Elasticsearch's performance?

OK, first let's identify the use case. You say it depends on our own use of Elasticsearch: we are using it to search data from a DB, and we will also fetch single documents and groups of documents with _mget (in the further future, we will add ML nodes).
In our case, is it necessary to do everything as you recommend, i.e. keep only 20-40 GB of data in each shard?

But we cannot control the disk I/O, and we cannot tell Elasticsearch to use only RAM. In any case, RAM will not be enough in the future: even if we had 200 GB of memory across the three nodes, it would not be enough, since other things will also require RAM.
PS: I started with Elasticsearch less than a month ago and am not yet able to understand everything about it.)
I'd like to get suggestions :))
Thank you for your responses! :blush: :blush: :blush:

The optimal shard size will also depend on the use case, e.g. the nature of the data and how and how often you query/update/add to it. To determine this you will need to benchmark with realistic data and queries. It is important to know that each query will execute against each shard in a single thread, so the shard size will affect query latency. If you query multiple shards, those will be queried in parallel as long as there are threads available. Querying lots of small shards might mean that each shard takes less time to query but you may on the other hand have so many that the tasks need to be queued up and run sequentially to some extent.
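
One way to check whether search tasks are queuing up is to look at the search thread pool stats on each node, for example:

```
GET _cat/thread_pool/search?v&h=node_name,active,queue,rejected
```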

When it comes to how much data each node can hold and how much RAM/heap this will require, you also need to benchmark. If you want good performance and have more data than can fit in memory, it is important to get the fastest storage you can, as this often quickly becomes the bottleneck.


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.