We have an Elasticsearch cluster with three data nodes. How many indices can it contain at most?
Each shard is a Lucene instance and carries some overhead, so how many indices and shards a node can handle typically depends on the node's available resources (e.g. heap) as well as the use case.
Thanks for your help. I still have a couple of questions:
- What is the largest shard size we should allow?
- We have 20 categories of data to ingest, about 20 GB total per day, so we have created 20 indices. Is this alright? How many shards should we set for each index?
Our cluster has 3 data nodes, each with a 4-core 2.6 GHz CPU and 32 GB of memory.
If you are using daily indices, generally aim to have shards between a few GB and a few tens of GB in size. The optimal size will vary depending on data, queries and latency requirements, but this is often an appropriate starting point that will allow you to store a good amount of data per node.
Whether you store different types of data in the same or different indices depends on how similar they are, but also on whether there are any mapping conflicts. If your different types of data do not have conflicting mappings I would suggest storing all categories in a single index with perhaps 2 or 3 primary shards. Having 20 daily indices, potentially with 5 primary and 5 replica shards each, for that data volume will result in very small shards and be very inefficient.
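A quick back-of-the-envelope calculation makes the difference concrete. This is just a sketch of the arithmetic behind the advice above, assuming ~20 GB of data per day spread evenly across shards; the function and variable names are illustrative, not from this thread.

```python
# Rough daily per-shard size for the two layouts discussed above,
# assuming ~20 GB/day distributed evenly across all primary shards.
DAILY_DATA_GB = 20

def per_shard_gb(num_indices, primaries_per_index):
    """Average daily data landing in each primary shard."""
    return DAILY_DATA_GB / (num_indices * primaries_per_index)

# 20 daily indices x 5 primaries each = 100 shards for 20 GB of data
tiny = per_shard_gb(num_indices=20, primaries_per_index=5)   # 0.2 GB per shard

# 1 daily index x 3 primaries = 3 shards for the same data
consolidated = per_shard_gb(num_indices=1, primaries_per_index=3)  # ~6.7 GB per shard

print(f"20 indices x 5 shards: {tiny:.1f} GB per shard")
print(f"1 index   x 3 shards: {consolidated:.1f} GB per shard")
```

With a single index and 3 primaries, each shard sits comfortably in the recommended "few GB to a few tens of GB" range, whereas 100 shards per day leaves each one far too small.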
If you still need to store the data in a few different indices due to conflicting mappings, I would recommend using 1 primary shard per index in order to ensure that shards do not get too small.
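For reference, a minimal sketch of setting the primary shard count at index creation time (the index name here is hypothetical; `number_of_shards` cannot be changed after creation without reindexing):

```
PUT /category-a-2018.05.01
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  }
}
```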