Number of shards in a 4-node cluster

Hi All,

Are there any best practices for choosing the number of shards for a
cluster? I have a 4-node cluster and configured 20 shards.

During a node failure or similar event, I suspect that because the shard
count is high, replication to the new node takes longer...

Is there any metric or formula for deciding the number of shards?

Regards
John

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6e51f1e4-8938-4196-84a9-007705869b6a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

What sort of data do you have, time-based or static? If it's the former,
then picking an arbitrary number is less of a problem, as you can change
it at the next rollover period. If it's static, then 4 would be a good start.

There aren't any metrics around this, other than not creating a large
number to start with, as each shard is a Lucene instance and does take
resources.
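
To illustrate why time-based data is more forgiving: the shard count is fixed per index at creation time, so each new period's index can be created with a different value. The sketch below only builds the index name and the JSON settings body; it does not talk to a cluster. The `logs` prefix, the daily naming pattern, and the helper name `daily_index_request` are assumptions made for illustration; `number_of_shards` and `number_of_replicas` are the standard index settings.

```python
import json
from datetime import date


def daily_index_request(prefix: str, day: date, primaries: int, replicas: int = 1):
    """Return (index_name, json_body) for creating one time-based index.

    Hypothetical helper: the naming scheme is an assumption, not something
    prescribed in this thread.
    """
    name = f"{prefix}-{day:%Y.%m.%d}"
    body = {
        "settings": {
            "number_of_shards": primaries,   # fixed once the index exists
            "number_of_replicas": replicas,  # can be changed later
        }
    }
    return name, json.dumps(body)


# Today's index uses 4 primaries; a later index could use a different count.
name, body = daily_index_request("logs", date(2015, 3, 17), primaries=4)
print(f"PUT /{name}")
print(body)
```

Because the setting lives on each index, a poor choice only affects the indices created while it was in effect.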

I typically suggest starting with the default of 5 shards. A single shard can hold several tens of gigabytes. In your case, 20 shards certainly seems like overkill for a 4-node cluster.
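
A rough way to see the difference is to count how many shard copies each node ends up holding. This is a minimal sketch; the replica count of 1 is an assumption, since the original question does not state it.

```python
def shard_copies_per_node(primaries: int, replicas: int, nodes: int) -> float:
    """Average number of shard copies (primaries plus replicas) per node."""
    total_copies = primaries * (1 + replicas)
    return total_copies / nodes


# Compare the 20 shards from the question with the default of 5,
# assuming 1 replica per primary on a 4-node cluster.
for primaries in (20, 5):
    per_node = shard_copies_per_node(primaries, replicas=1, nodes=4)
    print(f"{primaries} primaries -> {per_node:.1f} shard copies per node")
```

Under that assumption, a failed node has to have about 10 shard copies rebuilt elsewhere with 20 primaries, versus 2 or 3 with 5 primaries.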

My rule is: 1 primary shard per server.

Also estimate how big a single index/shard will be.

I think it is not good if a single shard exceeds 10 GB, although there is no
exact limit.

Georgi

We recommend shards no larger than 50 GB, but as you mention, there is no
exact limit.
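
A size ceiling like this translates directly into a minimum primary shard count. The sketch below is only an illustration: the 200 GB index size is a made-up figure, and `min_primaries` is a hypothetical helper name.

```python
import math


def min_primaries(index_size_gb: float, max_shard_gb: float) -> int:
    """Smallest primary shard count that keeps every shard under the ceiling."""
    return max(1, math.ceil(index_size_gb / max_shard_gb))


index_size_gb = 200  # hypothetical total primary data size
for ceiling_gb in (50, 10):
    count = min_primaries(index_size_gb, ceiling_gb)
    print(f"{ceiling_gb} GB ceiling -> at least {count} primaries")
```

For the same data, a 10 GB ceiling demands five times as many shards as a 50 GB ceiling, so the choice of ceiling drives the shard count more than the node count does.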

As Mark said, there is no hard limit on how big a single shard can be, but just so it’s clear, 10 GB is actually quite small for a single shard. It’s not at all uncommon for me to see shards of 60 GB or more.

Hi Mark,

May I ask what the reason for the 50 GB recommendation is?

Thanks,
Andrej

Part of it is based on knowledge picked up from our customers, and part of it
is that once you have to start shifting files larger than this around
(during reallocation or recovery), it can take an excessive amount of time.

There is also a hard limit of ~2 billion documents in a single shard,
which is a Lucene limit, so you reduce your exposure there too.
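
Both points can be put into rough numbers. In the sketch below, the 40 MB/s transfer rate and the 1 billion document count are hypothetical inputs chosen for illustration, not measurements. The constant is the per-index document maximum as I understand it from Lucene (`Integer.MAX_VALUE - 128`), which matches the "~2 billion" figure above.

```python
# Lucene's per-index document ceiling (Integer.MAX_VALUE - 128), i.e. ~2 billion.
LUCENE_MAX_DOCS = 2_147_483_519


def transfer_minutes(shard_gb: float, mb_per_sec: float) -> float:
    """Minutes needed to copy one shard at a sustained transfer rate."""
    return shard_gb * 1024 / mb_per_sec / 60


def doc_limit_used(docs_in_shard: int) -> float:
    """Fraction of the Lucene document limit already used by one shard."""
    return docs_in_shard / LUCENE_MAX_DOCS


# Hypothetical sustained rate of 40 MB/s during recovery.
for shard_gb in (10, 50, 100):
    minutes = transfer_minutes(shard_gb, mb_per_sec=40)
    print(f"{shard_gb} GB shard: about {minutes:.0f} min to relocate")

# A hypothetical shard holding 1 billion documents.
print(f"limit used: {doc_limit_used(1_000_000_000):.0%}")
```

Copy time grows linearly with shard size, so a shard twice as large takes twice as long to relocate at the same rate; smaller shards also leave more room below the document limit.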
