Essentially what the topic says. In terms of cluster performance and health, is it better to have one daily Metricbeat index with many filtered aliases, or many daily Metricbeat indices with no filtered aliases? Our use case is that Metricbeat supports many different teams, each with their own servers. We like to present each team with its own index that contains only its data. Both methods would accomplish this, but which is generally better?
Make sure you do not end up with a lot of small indices, which can be very inefficient. Depending on data volumes and retention period, it may make sense to use weekly or monthly indices rather than daily indices.
Do all teams have the same retention requirements? If so, sharing an index with aliases might work well. If not, separate indices might be better. How many teams do you have?
Hey, thanks Christian. Your blog post is actually what made me reconsider our setup. The teams have different retention requirements, but we use Curator to clean up old indices, so we can exclude by index alias in the delete action without issue. Right now we have about 14 teams with varying numbers of servers, so data volume differs quite a bit from team to team.
We currently give each team its own daily index with the default 5 primary shards and 1 replica. In the interim I'm changing this so that shards are larger and more evenly sized, with fewer shards per index.
I'm leaning towards implementing the filtered aliases to make everything more manageable. Would that be the best course of action?
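To put numbers on the current layout, here is a rough sketch of the shard arithmetic. The 14 teams, 5 primaries and 1 replica come from the posts above; the 30-day retention is only an assumption for illustration, since the real retention differs per team.

```python
def shards_per_day(teams: int, primaries: int, replicas: int) -> int:
    """Shards created per day: every primary plus its replica copies."""
    return teams * primaries * (1 + replicas)


def shards_retained(teams: int, primaries: int, replicas: int,
                    retention_days: int) -> int:
    """Shards held in the cluster once retention_days of indices exist."""
    return shards_per_day(teams, primaries, replicas) * retention_days


per_day = shards_per_day(teams=14, primaries=5, replicas=1)
# retention_days=30 is an assumed figure, not one given in this thread.
total = shards_retained(teams=14, primaries=5, replicas=1, retention_days=30)

print(per_day)  # 140
print(total)    # 4200
```

With daily indices of only a few GB each, most of those shards would be well under 1 GB, which is the "lots of small indices" situation warned about earlier.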
With Curator you manage retention by deleting whole indices, as this is much more efficient than using e.g. delete-by-query. If teams have different retention requirements and you do not want to keep data around longer than necessary, they should have different indices.
Aliases are quite lightweight but need to be stored in the cluster state, so they are not "free". Determining the best way to organize data probably requires more information, as e.g. similarity of mappings and differences in retention period and data volumes also play a role.
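As a rough illustration of why per-team retention maps naturally onto separate indices, the sketch below picks whole indices to delete based on the date in the index name. This is not Curator itself. The naming pattern `metricbeat-<team>-YYYY.MM.DD`, the team names and the retention values are all hypothetical.

```python
from datetime import date, datetime

PREFIX = "metricbeat-"


def expired_indices(index_names, retention_days, today):
    """Return the indices whose date suffix is older than the team's retention.

    Assumes the hypothetical naming pattern metricbeat-<team>-YYYY.MM.DD.
    """
    expired = []
    for name in index_names:
        head, _, day = name.rpartition("-")
        if not head.startswith(PREFIX):
            continue
        team = head[len(PREFIX):]
        if team not in retention_days:
            continue  # unknown team: leave the index alone
        try:
            index_date = datetime.strptime(day, "%Y.%m.%d").date()
        except ValueError:
            continue  # no parsable date suffix: leave the index alone
        if (today - index_date).days > retention_days[team]:
            expired.append(name)
    return expired


# Hypothetical teams and retention periods, in days.
retention = {"team-a": 7, "team-b": 30}
names = [
    "metricbeat-team-a-2018.06.01",
    "metricbeat-team-b-2018.06.01",
    "metricbeat-team-a-2018.06.20",
]
to_delete = expired_indices(names, retention, today=date(2018, 6, 21))
print(to_delete)  # ['metricbeat-team-a-2018.06.01']
```

With a single shared index behind filtered aliases, the same selection cannot be made per team, because deleting the index removes every team's data for that day.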
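For reference, a filtered alias is defined through the `_aliases` API with an `add` action that carries a `filter`. The sketch below only builds the request body. The index name, the alias naming scheme and the `fields.team` field are assumptions; substitute whatever field actually identifies a team in your Metricbeat events (it would need to be mapped as a keyword for a `term` filter to match exactly).

```python
import json


def filtered_alias_actions(index, teams, team_field="fields.team"):
    """Build an _aliases request body with one filtered alias per team.

    team_field is a hypothetical field name used for illustration only.
    """
    actions = []
    for team in teams:
        actions.append({
            "add": {
                "index": index,
                "alias": f"metricbeat-{team}",  # hypothetical alias naming
                "filter": {"term": {team_field: team}},
            }
        })
    return {"actions": actions}


# Hypothetical index and team names.
body = filtered_alias_actions("metricbeat-2018.06.01", ["team-a", "team-b"])
print(json.dumps(body, indent=2))
```

The resulting body would be sent as `POST /_aliases`. With 14 teams, each daily index carries 14 alias entries in the cluster state, which is the cost referred to above.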
Ahh, thanks Christian, that makes sense given how Curator manages deleting indices. I was misremembering how we currently use it: we use the alias exclude to exclude multiple individual indices that share an alias, not to exclude some entries within one index from the delete through aliases.
In terms of mappings, all the Metricbeat indices have the same one. The only things that differ are the retention periods and the sizes of the indices (they vary from ~1 GB to ~16 GB). What I was going to do now is change the sharding on each index so that shards are more evenly sized and distributed.
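A quick way to reason about the resharding is to pick a target shard size and derive the primary count from the index size. The sketch below assumes a 30 GB target, which is an illustrative number rather than a recommendation from this thread, and it treats the ~1 GB and ~16 GB figures as daily primary data sizes.

```python
import math

TARGET_SHARD_GB = 30  # assumed target, tune for your cluster


def primary_shards(index_size_gb, target_shard_gb=TARGET_SHARD_GB):
    """Smallest primary count that keeps each shard at or under the target."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))


# Compare daily, weekly and monthly indices for the smallest and largest team.
for period, days in (("daily", 1), ("weekly", 7), ("monthly", 30)):
    for daily_gb in (1, 16):
        size = daily_gb * days
        print(period, f"{size} GB ->", primary_shards(size), "primary shard(s)")
```

Under these assumptions even the largest team fits in a single primary shard per daily index, and the smallest teams only approach the target with monthly indices, which matches the earlier suggestion to consider weekly or monthly indices.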