I have read through many guides and am currently defining an index for our production cluster (three nodes to start with).
I have questions about the following things:
- Is the document count relevant for an index at all? Some guides decide the shard count based on the document count. I thought the amount of disk space the documents need is the factor to look at?
- We have roughly 25 GB of logs a month, so a daily index is the wrong thing to do, but I wonder how many shards I should choose for a monthly index. The data size suggests one primary shard, but then I can't scale the system or use all existing nodes for ingesting data. These were my considerations (all with a monthly index in mind):
1 primary shard + one replica, to get more query performance.
3 primary shards + one replica, to have data from the index on every node for better ingest and query performance. Also, if the amount of data grows, a single shard won't exceed the suggested maximum size of 50 GB.
4-6 primary shards + one replica, as overallocation to make use of future nodes (I don't know if this will happen).
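
To make the three options easier to compare, here is a rough sketch of the arithmetic behind them. It assumes the index takes about as much disk space as the raw logs (25 GB a month) and that documents spread evenly across the primary shards; neither is guaranteed in practice. The function name `describe` and the constants are only illustrative.

```python
MONTHLY_GB = 25      # from the question: roughly 25 GB of logs per month
MAX_SHARD_GB = 50    # suggested maximum shard size mentioned in the question
NODES = 3            # initial cluster size
REPLICAS = 1         # every option above uses one replica


def describe(primaries, monthly_gb=MONTHLY_GB, replicas=REPLICAS, nodes=NODES):
    """Rough per-shard size and shard-copy counts for one monthly index.

    Assumes on-disk size equals raw log size and an even spread of
    documents across primaries (both are simplifications).
    """
    total_copies = primaries * (1 + replicas)
    return {
        "primaries": primaries,
        "gb_per_primary": monthly_gb / primaries,
        "total_shard_copies": total_copies,
        "copies_per_node": total_copies / nodes,
        # Monthly volume at which each primary would reach MAX_SHARD_GB.
        "monthly_gb_at_shard_limit": primaries * MAX_SHARD_GB,
    }


if __name__ == "__main__":
    # 1, 3 and 6 primaries correspond to the three options considered above.
    for primaries in (1, 3, 6):
        print(describe(primaries))
```

Under these assumptions, even the single-primary option stays well below the 50 GB suggestion at the current volume, so the choice between the options is mostly about spreading ingest load across nodes and leaving room for growth, not about shard size today.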