Advice on number of shards


(Rafał Kuć) #1

Hello,

I have a question. We will be migrating our client's current Lucene
index to Elasticsearch. Right now, the index contains 3 million
documents, with planned growth of about 1 million a month. Those 3
million documents take up about 350GB, divided across 4 machines.

We would like to plan the Elasticsearch shard count so that we can
easily add nodes in the near future. What do you think about having
40 shards for that amount of data?
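
To put rough numbers on the question, here is a minimal sketch of the per-shard size over time. It uses only the figures quoted above; the constant average document size, the even spread of documents across shards, and the 6- and 12-month horizons are assumptions made for illustration, and replicas are ignored.

```python
# Figures taken from the post; everything else is an illustrative assumption.
DOCS_NOW = 3_000_000               # documents in the current Lucene index
SIZE_NOW_GB = 350                  # total index size today
GROWTH_DOCS_PER_MONTH = 1_000_000  # planned growth
SHARDS = 40                        # proposed shard count


def projected_shard_size_gb(months: int, shards: int = SHARDS) -> float:
    """Estimate primary data per shard after `months`.

    Assumes the average document size stays constant and that documents
    are spread evenly across shards (no replicas counted).
    """
    total_docs = DOCS_NOW + GROWTH_DOCS_PER_MONTH * months
    total_gb = SIZE_NOW_GB * total_docs / DOCS_NOW
    return total_gb / shards


if __name__ == "__main__":
    for months in (0, 6, 12):
        size = projected_shard_size_gb(months)
        print(f"{months:>2} months: {size:.2f} GB per shard")
```

Under these assumptions a shard holds roughly 8.75 GB today and about 43.75 GB after a year, so the growth rate matters as much as the current size when picking the count.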

Regards,
Rafał Kuć


(Shay Banon) #2

40 shards sounds a bit problematic on something like 4-5 nodes, because a
single search request will need to span all 40 shards (unless you use
something like routing). See more info here:
https://groups.google.com/forum/#!searchin/elasticsearch/data$20flow/elasticsearch/49q-_AgQCp8/MRol0t9asEcJ
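
The fan-out concern can be sketched with simple arithmetic. The helper below is hypothetical (it is not an Elasticsearch API) and assumes a single index, primary shards only, even allocation across nodes, and a routed search that carries exactly one routing value.

```python
import math


def search_fanout(total_shards: int, nodes: int, routed: bool) -> dict:
    """Count shard-level work for one search request.

    Assumptions: one index, primaries only, shards allocated evenly,
    and a routed search that targets a single routing value.
    """
    # Without routing the request must visit every shard of the index.
    shards_queried = 1 if routed else total_shards
    return {
        "shards_queried": shards_queried,
        # Upper bound on shards hosted by any one node.
        "max_shards_per_node": math.ceil(total_shards / nodes),
    }


if __name__ == "__main__":
    for nodes in (4, 5):
        print(nodes, "nodes, unrouted:", search_fanout(40, nodes, routed=False))
    print("4 nodes, routed:", search_fanout(40, 4, routed=True))
```

With 40 shards on 4 nodes, an unrouted search touches all 40 shards, up to 10 on each node, while a routed search touches one.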

On Mon, Jan 23, 2012 at 3:50 PM, Rafał Kuć rafal.kuc@gmail.com wrote:

Hello,

I have a question. We will be migrating our client's current Lucene
index to Elasticsearch. Right now, the index contains 3 million
documents, with planned growth of about 1 million a month. Those 3
million documents take up about 350GB, divided across 4 machines.

We would like to plan the Elasticsearch shard count so that we can
easily add nodes in the near future. What do you think about having
40 shards for that amount of data?

Regards,
Rafał Kuć


(Rafał Kuć) #3

Thanks for the reply, Shay. I forgot to mention that we will be using
routing, so the queries won't hit all of the shards, just the one
selected by the routing value.
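
As a rough illustration of why a routed query lands on a single shard, the sketch below maps a routing value to a shard number by hashing it and taking the remainder modulo the shard count. The CRC32 hash is a stand-in chosen for this example, not the hash Elasticsearch uses internally, and the routing values are hypothetical.

```python
import zlib

NUM_SHARDS = 40  # shard count proposed in this thread


def shard_for(routing: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a routing value to a shard number.

    zlib.crc32 is only a stand-in hash for illustration; Elasticsearch
    applies its own hash function to the routing value.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_shards


if __name__ == "__main__":
    # Hypothetical routing values, e.g. one per customer.
    for key in ("customer-17", "customer-42", "customer-17"):
        print(key, "-> shard", shard_for(key))
```

The same routing value always maps to the same shard, so indexing and searching with that value touch one shard only. Because the result depends on the shard count, changing the count would remap existing routing values, which is one reason to settle the number up front.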

Regards
Rafał

On 23 Jan, 20:07, Shay Banon kim...@gmail.com wrote:

40 shards sounds a bit problematic on something like 4-5 nodes, because a
single search request will need to span all 40 shards (unless you use
something like routing). See more info here:
https://groups.google.com/forum/#!searchin/elasticsearch/data$20flow/...


