Which is better: monthly indices or one index with more shards?

defalt · September 17, 2020, 1:03pm

Hi,
we are currently planning a cluster that will take in about 1TB of data per year. With a standard shard count of 5, a single index would end up with shards of about 200GB each after one year. As far as I know, shards should be no larger than around 50-70GB. So we have 2 options:

Increase the number of shards to 20 or more.
Create a new index every month or so.
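
As a rough check on the two options, here is a minimal sketch of the arithmetic. It assumes 1TB is taken as 1000GB, that data is spread evenly across primary shards, and that replicas are ignored; the function name `shard_size_gb` is hypothetical.

```python
def shard_size_gb(total_gb: float, shards: int, indices: int = 1) -> float:
    """Average primary shard size when total_gb is split evenly
    across `indices` indices with `shards` primary shards each."""
    return total_gb / (indices * shards)


YEARLY_GB = 1000  # assumption: 1TB/year taken as 1000GB

# One index for the whole year, 5 primary shards
print(shard_size_gb(YEARLY_GB, shards=5))
# One index for the whole year, 20 primary shards
print(shard_size_gb(YEARLY_GB, shards=20))
# Twelve monthly indices with a single primary shard each
print(round(shard_size_gb(YEARLY_GB, shards=1, indices=12), 1))
```

Note that a single index keeps growing after the first year, while monthly indices keep each shard at roughly one month of data regardless of how long the cluster runs.
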
Questions:
How can I configure Logstash so that it uses the right index when a new index is created? Zero downtime is required, that's why using one index seems way easier.
What creates the new index, and where is that configured?
I can access all indices at the same time with an alias, right?

Thanks

warkolm · September 17, 2020, 10:03pm

Is it time based data?

defalt · September 18, 2020, 7:09am

Yes, we run 50 tests over the course of a day, and this data is sent to Elasticsearch.

warkolm · September 19, 2020, 5:59am

Ok, why not use ILM?

Christian_Dahlqvist · September 19, 2020, 6:09am

An important question here is how long you need to keep the data. By far the most efficient way to delete data in Elasticsearch is to delete complete indices, and this is one of the main reasons why time-based indices are used. If you have a single index, you need to delete old data using delete-by-query, which is much less efficient and will put a much higher load on your system.
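
To make the difference concrete, here is a minimal sketch that only builds the two REST requests without sending them. The cluster URL, the index names, and the `@timestamp` field are assumptions for illustration, not details from this thread.

```python
import json
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: local cluster without security


def drop_index_request(index: str) -> Request:
    """DELETE /<index>: removes a complete index in one operation."""
    return Request(f"{ES_URL}/{index}", method="DELETE")


def delete_by_query_request(index: str, before: str) -> Request:
    """POST /<index>/_delete_by_query: Elasticsearch has to find every
    matching document and mark it as deleted, which is far more work."""
    body = {"query": {"range": {"@timestamp": {"lt": before}}}}
    return Request(
        f"{ES_URL}/{index}/_delete_by_query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Time-based indices: expire a whole month by dropping its index
monthly = drop_index_request("tests-2019.09")
# Single index: expire the same data with a range query
single = delete_by_query_request("tests", "2019-10-01")
print(monthly.get_method(), monthly.full_url)
print(single.get_method(), single.full_url)
```

Passing either request to `urllib.request.urlopen` would execute it against the cluster.
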

I would recommend using time-based indices. You can either use rollover to create new indices based on a combination of size and/or age, or just make Logstash create indices for fixed periods by specifying a date pattern in the index name. You can see an example of this here; just leave out the dd portion of the date to create a monthly index.
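
The effect of a monthly date pattern can be sketched as follows. In Logstash the pattern would sit in the `index` option of the elasticsearch output, for example something like `tests-%{+yyyy.MM}`; the `tests` prefix and the helper name `monthly_index_name` are assumptions for illustration.

```python
from datetime import datetime


def monthly_index_name(prefix: str, ts: datetime) -> str:
    """Build the index name for an event timestamp (expected in UTC),
    using year and month only, so all events from the same month
    land in the same index."""
    return f"{prefix}-{ts.strftime('%Y.%m')}"


# Two events from the same month share an index
print(monthly_index_name("tests", datetime(2020, 9, 17, 13, 3)))
print(monthly_index_name("tests", datetime(2020, 9, 30, 23, 59)))
# The first event of the next month goes to a new index
print(monthly_index_name("tests", datetime(2020, 10, 1, 0, 0)))
```

Because the name is derived from each event's timestamp, the first event of a new month causes the new index to be created automatically, with no restart or reconfiguration of Logstash.
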

Whether you use rollover or indices with a date in the index name, you can use ILM to manage the rollover (if applicable) and retention.
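
As an illustration, here is a minimal sketch of an ILM policy that rolls over on size or age and deletes old indices. The thresholds (50gb, 30d, 365d), the policy name, and the cluster URL are assumptions; the thread does not state a retention period. The request is only built, not sent.

```python
import json
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: local cluster without security


def ilm_policy(max_size: str = "50gb", max_age: str = "30d",
               delete_after: str = "365d") -> dict:
    """Policy body: roll over in the hot phase, delete after a retention period."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Rollover triggers when either condition is met
                        "rollover": {"max_size": max_size, "max_age": max_age}
                    }
                },
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


def put_policy_request(name: str, policy: dict) -> Request:
    """PUT /_ilm/policy/<name> creates or updates the policy."""
    return Request(
        f"{ES_URL}/_ilm/policy/{name}",
        data=json.dumps(policy).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = put_policy_request("tests-policy", ilm_policy())
print(req.get_method(), req.full_url)
print(json.dumps(ilm_policy(), indent=2))
```

For indices named by date pattern, the rollover action would be left out and only the delete phase kept. With rollover, the policy also has to be attached through an index template that sets `index.lifecycle.name` and `index.lifecycle.rollover_alias`.
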

system · October 17, 2020, 6:09am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
