Is there an ideal amount of indexes to have and if so why? I am trying to
figure out which indexes to store my types under and wanted to make sure I
was considering all of the factors.
A rule of thumb could be: the number of shards per node equals the number of cores (processors).
So if you have 3 nodes with 8 cores per node, you have 24 cores in your cluster.
If you want 1 replica per shard, the number of primary shards is 12 (12 primaries plus 12 replicas gives 24 shards).
That could be 12 indices with 1 shard each, or 1 index with 12 shards...
But it also really depends on your usage. If some shards are less used than others (think time based indices), you can have more shards on a box than the number of cores.
Ideally, you should test it on the targeted hardware to find your magic numbers.
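The arithmetic behind that rule of thumb can be written as a small sketch. The function name and parameters are made up for illustration; this is only the one-shard-per-core heuristic from above, not an Elasticsearch API, and real numbers should still come from testing.

```python
def primary_shards(nodes: int, cores_per_node: int, replicas: int) -> int:
    """Primary shard count under the one-shard-per-core rule of thumb.

    Hypothetical helper: it assumes every shard copy (primary or replica)
    should get one core somewhere in the cluster.
    """
    total_cores = nodes * cores_per_node
    # Each primary shard exists (1 + replicas) times across the cluster.
    copies_per_primary = 1 + replicas
    return total_cores // copies_per_primary


# 3 nodes, 8 cores each, 1 replica per shard
print(primary_shards(3, 8, 1))  # 12
```

The result is a total across the cluster, so it can be spread over 12 single-shard indices, one 12-shard index, or anything in between.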
My 2 cents.
--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
Elasticsearch Head is very popular, and a wonderful tool.
Elasticsearch Head's formatting is broken for 11 or more shards
(double-digit shard numbers).
So my rule of thumb is a maximum of 10 shards.
But seriously, I've had great success with 97M+ documents in 10 shards,
whether in a 1-machine cluster (my laptop) or a 3-machine cluster (Solaris
x86-64 VMs; shared CPUs I'm sure).
What I'm waiting for are the final fixes to the split-brain issues that
some have seen. ES 0.90.0 is working fine in this environment so far,
though.
On Monday, August 5, 2013 6:50:47 PM UTC-4, iman...@gmail.com wrote:
Is there an ideal amount of indexes to have and if so why? I am trying to
figure out which indexes to store my types under and wanted to make sure I
was considering all of the factors.
For the sake of completeness, you can over-allocate shards, so that the
number of indexes no longer depends on the hardware resources. Then there is
no need to worry about an ideal number of indexes.
That is, you create one index over all nodes. Then you can configure index
aliases on this index, each with a filter on a tag field in the documents, so
that an alias only returns the docs that belong to it. To clients using the
ES API, it looks like you have many indexes (if you can post-process the
_index field in the response and enforce the tag field in the mapping). You
can configure tens, hundreds, or thousands of alias indexes without having to
worry about the organization of concrete indexes.
Still, there is one dependency: you should not let the volume of a shard grow
too large. How large is too large depends on your performance requirements,
your node capacity, and how fast you want merging, recovering, and moving
shards to be.
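As a rough sketch of the filtered-alias idea described above, the snippet below only builds the request body for the `_aliases` endpoint and prints it; it does not talk to a cluster. The index name `shared_index`, the field name `tag`, and the alias names are assumptions made up for illustration, and the term filter assumes the tag field is indexed as a single exact value.

```python
import json


def alias_actions(index: str, tag_field: str, tags: list) -> dict:
    """Build an _aliases request body with one filtered alias per tag.

    Hypothetical helper: each alias points at the same concrete index and
    carries a term filter on `tag_field`, so it only exposes its own docs.
    """
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": tag,
                    "filter": {"term": {tag_field: tag}},
                }
            }
            for tag in tags
        ]
    }


body = alias_actions("shared_index", "tag", ["customer_a", "customer_b"])
# This JSON would be sent as the body of a POST to /_aliases.
print(json.dumps(body, indent=2))
```

Searching against `customer_a` then behaves like searching a separate index, while all documents live in the shards of `shared_index`.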