Hi folks,
I'm doing go-live planning for our production ELK stack, and I came across a few useful sizing formulas in the official Elastic webinar on "capacity planning and sizing".
I'm confused about the shard calculation based on the formula below:
Total shards = number of index patterns x number of primaries x (number of replicas + 1) x total interval of retention
In my environment I have
number of index patterns = 20
number of primaries = 5
number of replicas = 5 (so number of replicas + 1 = 6)
total retention period = 30
number of shards = 20 x 5 x 6 x 30 = 18000
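For anyone who wants to check the arithmetic, here is a minimal sketch of the formula in Python. The function and parameter names are my own and purely illustrative; they are not part of any Elastic tool.

```python
def total_shards(index_patterns: int, primaries: int, replicas: int,
                 retention_intervals: int) -> int:
    """Total shards = patterns x primaries x (replicas + 1) x retention intervals."""
    return index_patterns * primaries * (replicas + 1) * retention_intervals


# Values from this post: 20 index patterns, 5 primaries, 5 replicas,
# and 30 retained intervals.
print(total_shards(20, 5, 5, 30))  # 18000
```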
With two nodes, that comes to 9000 shards per node, or 450 shards per node for each of the 20 index patterns.
I'm not sure about best practice, but I understand there is a performance limit on the number of shards per index. Would increasing the number of primaries help me manage it?
If you have 20 different indices, require a 30-day retention period, and use daily indices, I would recommend setting the number of primary shards to 1 for all indices, which is the default in newer versions. With one replica per primary, this gives you 20 x 1 x 2 x 30 = 1200 shards.
You can reduce this further by using rollover and letting each index cover a longer period than a day while its shard size stays below a threshold.
By default there is a limit of 1000 shards per node (the cluster.max_shards_per_node setting), but you should aim to have fewer, as described in this blog post.
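As a rough check of these numbers against the per-node limit, here is a small Python sketch. It assumes shards are spread evenly across the two nodes mentioned in the question, and the helper names are made up for illustration only.

```python
import math

SHARD_LIMIT_PER_NODE = 1000  # per-node limit mentioned above


def total_shards(index_patterns, primaries, replicas, retention_intervals):
    # Same formula as in the question.
    return index_patterns * primaries * (replicas + 1) * retention_intervals


def shards_per_node(total, data_nodes):
    # Assumes an even spread of shards across nodes (a simplification).
    return math.ceil(total / data_nodes)


# Suggested setup: 20 indices, 1 primary, 1 replica, 30 daily indices retained.
total = total_shards(20, 1, 1, 30)
per_node = shards_per_node(total, 2)
print(total, per_node, per_node <= SHARD_LIMIT_PER_NODE)  # 1200 600 True

# Original plan from the question: 5 primaries and 5 replicas.
original_per_node = shards_per_node(total_shards(20, 5, 5, 30), 2)
print(original_per_node, original_per_node <= SHARD_LIMIT_PER_NODE)  # 9000 False
```

Being under the limit only shows that the cluster would accept the shards; the advice to aim for fewer still applies.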
I would also recommend having at least 3 master-eligible nodes in the cluster, as this gives you the ability to continue operating if one node goes down.