I add to Radu's answer that if you have many shards on a single node, you can hit the "too many open files" issue. Each shard is a full Lucene instance.
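A quick way to see the per-process file-descriptor limit is Python's standard-library `resource` module (a minimal sketch, Unix only):

```python
import resource

# Soft limit: current ceiling on open file descriptors for this process.
# Hard limit: the maximum the soft limit can be raised to without root.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(soft, hard)
```

Each Lucene shard keeps several files open per segment, so shard count times segment count can approach the soft limit surprisingly fast.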
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 25 Oct 2012, at 11:46, Radu Gheorghe email@example.com wrote:
The problem with having many shards is that each shard carries
overhead. So while it's OK to over-shard somewhat in order to scale
out later, if you overdo it things will get slow, especially on the
query side.
Luckily, there are other options besides starting off with a huge
number of shards; which one fits best depends on your data.
For example, you can add more indices as you go along.
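"Adding more indices as you go" is often done with time-based index names. Here's a minimal sketch of the idea (the `logs` prefix and monthly granularity are assumptions for illustration, not anything Elasticsearch prescribes):

```python
from datetime import date

def monthly_index(day: date, prefix: str = "logs") -> str:
    """Hypothetical naming scheme: one index per month."""
    return f"{prefix}-{day:%Y.%m}"

# Writes go to the current month's index; searches can span many
# indices at once (e.g. with a wildcard pattern such as logs-*).
print(monthly_index(date(2012, 10, 25)))
```

Because each new index gets its own shards, data volume grows without ever changing the shard count of an existing index.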
I suggest you take a look at this video, as it's exactly about this topic:
http://sematext.com/ -- ElasticSearch -- Solr -- Lucene
On Thu, Oct 25, 2012 at 10:05 AM, Jerry Chou firstname.lastname@example.org wrote:
For each cluster, I know that I can scale out query capacity by adding
replicas. But is it possible to scale out in data volume as well?
As I understand it, all data is written to one of the primary shards
first, then copied to its replicas.
However, the number of primary shards has to be defined at the beginning.
Doesn't that pose a limitation on the max number of instances (for
primary shards) in the cluster?
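The reason the primary-shard count is fixed can be illustrated with the shard-routing idea: a document's primary shard is chosen by hashing its routing key modulo the number of primary shards. A sketch (the real Elasticsearch hash function differs; `md5` here is just a stand-in):

```python
import hashlib

def route(doc_id: str, num_primary_shards: int) -> int:
    """Pick the primary shard for a document id (illustrative hash)."""
    h = int(hashlib.md5(doc_id.encode()).hexdigest(), 16)
    return h % num_primary_shards

# If the number of primary shards changed, the modulus would change and
# most existing documents would map to a different shard, so every
# document would have to be rehashed and moved.
shards_with_5 = [route(f"doc-{i}", 5) for i in range(10)]
shards_with_6 = [route(f"doc-{i}", 6) for i in range(10)]
print(shards_with_5)
print(shards_with_6)
```

This is why the count must be set at index creation, and why adding whole new indices (rather than shards) is the usual way to grow.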
The default number of primary shards is 5. For future scalability, is
there any drawback if I set it to a big number?