Total shards per node calculation

I've been working with Elastic for a while now, but I wanted to do a quick check on my learning that may also be useful to others. I've got a customer that would like 4 separate indices for logs. The only difference between them is that they come from different environments, i.e. Prod and Dev, and slightly different file paths. We have 3 availability zones in our ECE deployment, the hot tier has 64GB of RAM allocated to each node, which should mean we have 32GB of heap space per node (also a check on learning), and we use 3 primaries with 1 replica for our indices.

If my understanding of shards is correct, we'd end up with 24 shards in total: 6 per index (3 primaries plus 3 replicas), which works out to 2 shards per index in each AZ. Elastic's recommendation states: "The number of shards a data node can hold is proportional to the node's heap memory. For example, a node with 30GB of heap memory should have at most 600 shards. The further below this limit you can keep your nodes, the better. If you find your nodes exceeding more than 20 shards per GB, consider adding another node." That would indicate to me that the best way to implement the request, while also "future proofing" the stability of the deployment, is to use the saved search function in Kibana.
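Spelling the arithmetic out (assuming 4 indices, 3 primaries, 1 replica, 3 AZs, and the 20-shards-per-GB-of-heap guideline quoted above):

$$
\begin{aligned}
\text{total shards} &= 4~\text{indices} \times 3~\text{primaries} \times (1 + 1~\text{replica}) = 24 \\
\text{per index} &= 24 / 4 = 6, \qquad \text{per index per AZ} = 6 / 3 = 2 \\
\text{node guideline} &\approx 20~\text{shards/GB} \times 32~\text{GB heap} = 640~\text{shards per node}
\end{aligned}
$$

Either layout is far below the guideline today; the question is about headroom as more customers and indices are added.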

In doing this we'd only have two indices; we'd add tags or fields within the Filebeat config, and the user could easily sort through the different environments/file paths using that extra data. That would take us down to 12 shards: 6 per index, or 2 per index in each AZ. This seems like the better alternative as the customer base grows and more Filebeat indices are added, and it's also a bit easier to manage in the Logstash output section. Is using the saved search function the best option for scenarios like this?
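For illustration, the Filebeat side of that could look something like the sketch below; the paths, tag names, and the `environment` field are hypothetical placeholders, not the customer's actual config:

```yaml
filebeat.inputs:
  # Prod logs: same format as Dev, so they can share an index.
  - type: log
    paths:
      - /var/log/app/prod/*.log   # hypothetical path
    tags: ["prod"]
    fields:
      environment: prod           # filter / saved-search on this in Kibana
    fields_under_root: true

  # Dev logs: only the path and the environment value differ.
  - type: log
    paths:
      - /var/log/app/dev/*.log    # hypothetical path
    tags: ["dev"]
    fields:
      environment: dev
    fields_under_root: true
```

The Logstash output can then send everything for a given source type to a single index, with saved searches in Kibana filtering on `environment` (or the tags) instead of relying on separate indices.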


How much data are you putting into an index?

In this instance the indices will probably be under 1GB, but we do have customers whose log indices run in the 10-30GB range.

Splitting things out by environment makes sense, as does putting sources with different formats into different indices (to avoid mapping issues).

I'd be using ILM; it will greatly help with potential shard explosion. Saved searches will help if you put things into the same indices, but on their own they aren't a complete solution.
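As a sketch, an ILM policy along these lines would roll indices over and delete old logs automatically; the policy name, rollover thresholds, and retention period here are placeholders to adjust, not recommendations:

```
PUT _ilm/policy/filebeat-logs
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "30d"
          }
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```

Attach it to the Filebeat index template via `index.lifecycle.name` and `index.lifecycle.rollover_alias`, and rollover keeps individual indices, and therefore shard counts, bounded.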

Also, if you are using ECE then you should definitely run this via Support.

Sounds good. ILM will definitely be implemented with these indices. No need to hold on to logs longer than we have to.

I use the ECE support page A LOT. I figured this one might also be beneficial to the community and new users, so I put it up here.

Appreciate the help Mark, enjoy the rest of your day.
