Is Daily-Index better than Monthly-Index?

Hi,

My Elasticsearch indices are time-based, with one index per month. Each monthly index is at this scale:

  • Number of documents: 100M
  • Size: 125 GB

The index was created with 12 shards + 2 replicas.

Since most of my users' search requests only cover a date range of 2-3 days, does it make sense to create one index per day and use an alias to join them? Will this improve query performance?
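To make the daily-index idea concrete, here is a minimal sketch of the request body that would attach a few daily indices to one alias through the `POST /_aliases` endpoint. The index prefix `logs`, the alias name `logs-all`, and the date-based naming scheme are hypothetical placeholders, not names from my cluster.

```python
import json
from datetime import date, timedelta


def daily_index_names(prefix, start, days):
    """Return one index name per day, e.g. logs-2024.01.01 (naming scheme is hypothetical)."""
    return [
        f"{prefix}-{(start + timedelta(days=i)).strftime('%Y.%m.%d')}"
        for i in range(days)
    ]


def alias_actions(alias, indices):
    """Build a body for POST /_aliases that adds every given index to one alias."""
    return {"actions": [{"add": {"index": name, "alias": alias}} for name in indices]}


# Hypothetical names: "logs" prefix and "logs-all" alias are placeholders.
indices = daily_index_names("logs", date(2024, 1, 1), 3)
body = alias_actions("logs-all", indices)
print(json.dumps(body, indent=2))
```

A search against the alias would then fan out to every index behind it, so the alias by itself does not narrow a 2-3 day query to the matching daily indices.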

One index per day is probably too much for that amount of data.

Look at using ILM instead.
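As a rough illustration of what an ILM setup could look like, the sketch below builds a policy body for `PUT _ilm/policy/<policy-name>` that rolls over to a new index based on shard size or age instead of on a fixed calendar boundary. The thresholds are illustrative assumptions only, and `max_primary_shard_size` requires a reasonably recent Elasticsearch version.

```python
import json

# Hypothetical ILM policy body; the thresholds are placeholders, not a recommendation.
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {
                        # Roll over once the largest primary shard reaches this size...
                        "max_primary_shard_size": "50gb",
                        # ...or once the index reaches this age, whichever comes first.
                        "max_age": "30d",
                    }
                }
            }
        }
    }
}

print(json.dumps(ilm_policy, indent=2))
```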

This is just for 1 month of data. We have kept 5 years of such data. Is it too much?

Also, what's the rule to determine the number of shards?

It really depends, but we suggest no more than 30-50GB per shard.
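Applied to the numbers in this thread, that guideline works out roughly as follows. This is only a back-of-the-envelope sketch: the helper name `primaries_for` is made up for illustration, and a month is assumed to have 30 days.

```python
import math


def primaries_for(index_size_gb, max_shard_gb):
    """Smallest primary shard count that keeps each shard at or below max_shard_gb."""
    return max(1, math.ceil(index_size_gb / max_shard_gb))


monthly_gb = 125          # size of one monthly index, from the question
current_primaries = 12    # current setting, from the question

# Average shard size with the current layout: about 10.4 GB.
current_shard_gb = monthly_gb / current_primaries
print(round(current_shard_gb, 1))   # 10.4

# Primaries needed to stay under the 50 GB and 30 GB ends of the guideline.
print(primaries_for(monthly_gb, 50))   # 3
print(primaries_for(monthly_gb, 30))   # 5

# A daily index would hold roughly 4.2 GB (assuming 30 days per month),
# so a single primary shard would already be far below the guideline.
daily_gb = monthly_gb / 30
print(round(daily_gb, 1), primaries_for(daily_gb, 30))   # 4.2 1
```

So a monthly index of this size already sits well below the suggested ceiling with 12 primaries, and daily indices would produce much smaller shards still.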

One more piece of information I forgot to mention: our Elasticsearch cluster has 24 nodes. Is it still too much for one index to have 12 shards + 2 replicas?
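To put numbers on the question, here is a quick sketch of the total shard count over 5 years. It assumes "12 shards + 2 replicas" means 12 primaries with 2 replica copies each, that daily indices would keep the same settings, and that shards spread evenly across the nodes.

```python
def shard_copies(primaries, replicas, index_count):
    """Total shard copies (primaries plus replicas) across all indices."""
    return primaries * (1 + replicas) * index_count


nodes = 24

# Monthly indices: 5 years * 12 months = 60 indices.
monthly_total = shard_copies(12, 2, 5 * 12)
print(monthly_total, monthly_total / nodes)   # 2160 90.0

# Daily indices with the same settings: 5 years * 365 days = 1825 indices.
daily_total = shard_copies(12, 2, 5 * 365)
print(daily_total, daily_total / nodes)       # 65700 2737.5
```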

May I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

And https://www.elastic.co/webinars/using-rally-to-get-your-elasticsearch-cluster-size-right
