About shards, nodes and cores

lfv89 · November 17, 2018, 6:55pm

I've been digging into Elastic's mechanism of shard distribution and I'd like to validate my understanding to see if I got something wrong or if there is something missing.

With the aim of achieving a higher degree of parallelism inside a cluster, while still avoiding resource competition among the shards, would the statement below be correct?

In a cluster with 3 indexes, each index has 5 primary shards, the minimum number of cores that this cluster should have is 3 * 5 = 15 cores, regardless if that would lead to an architecture where the cluster would have a single node with 15 cores or 15 nodes with 1 core on each node.

warkolm · November 18, 2018, 2:09am

That really depends on the work load.

If you have a very heavy search use case then it makes sense.

lfv89 · November 18, 2018, 2:19pm

I see, thanks. And in this case the number of replica and data shards wouldn't impact the ideal number of cores, correct? When deciding the number of cores in a cluster, the only type of shards that should be taken into account is really the primary ones, right?

warkolm · November 18, 2018, 7:37pm

Again, it depends.

What sort of workload do you have?

lfv89 · November 18, 2018, 9:52pm

Sadly I don't have the exact numbers right now, but I can request and have them by tomorrow.

Which ones would you suggest me to bring?

warkolm · November 18, 2018, 11:44pm

I'm really just trying to get information to understand why you think this is so important.

Christian_Dahlqvist · November 19, 2018, 6:20am

The ideal number of shards depends a lot on the workload and latency requirements, and the number of primary and replica shards both play a part here. It also depends on the number of concurrent queries you are optimising for.

lfv89 · November 19, 2018, 1:11pm

Hey Christian, thanks.

I'm less concerned about the ideal number of shards and more about the relationship between the total number of shards in my cluster and the total number of cores that the cluster should have, in order to accommodate every shard (optimized for parallelism) without burning money.

Let's say that for a fact my one-indexed cluster needs exactly 15 primary shards. Does that mean that I need exactly 15 cores to achieve the sweet spot, or in this calculation I should also taken into account other factors such as, for instance, the number replica shards?

Christian_Dahlqvist · November 19, 2018, 1:14pm

There is no direct link between shards and cores. You can run a node with 15 shards on a VM with a single core, although that may not be optimal for performance. When you run a query against a number of shards, each shard is processed in a single thread, although different shards naturally are being queried in parallel. The more cores you have, the more shards can be processed in parallel. If you have more cores than shards on a node, some may be idle unless you have more than one query running concurrently.

lfv89 · November 19, 2018, 6:39pm

Exactly what was I looking for. If I understood correctly then:

A cluster with more cores than shards wouldn't necessarily be a problem per se, since the load could be such that no cores would be idle, and in that scenario two or more cores (each with its own single thread) could be reading the same shard, in the same node at same moment, but for different queries.

Is that a reasonable summarization?

Christian_Dahlqvist · November 19, 2018, 6:56pm

Yes, I think that sounds correct.

lfv89 · November 19, 2018, 7:42pm

Perfect, thanks.

system · December 17, 2018, 7:43pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Optimal shards: 1 or number of nodes? Considerations Elasticsearch	10	5219	August 29, 2018
When do you need more then 1 shard? Elasticsearch	12	1853	July 6, 2017
Relation between shards and nodes Elasticsearch	5	1779	November 22, 2017
Shards per Host / Cores Elasticsearch	4	413	January 31, 2019
How many shards? Elasticsearch	2	254	July 6, 2017

About shards, nodes and cores

Related topics