Scale for the size and number of active indices

I have 15 GB, or about 80,000,000 documents, generated every day and streamed from a Logstash pipeline into Elasticsearch, and I need to keep 30 days of data active and queryable. My queries are all aggregation queries with multiple layers of buckets.
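To make the workload concrete, my queries look roughly like the sketch below (written with Python's requests library just for illustration; the field names service, status, and duration are placeholders, not my real mapping):

import requests

# Two layers of terms buckets with an avg metric at the bottom; the request
# goes to the wildcard pattern name_* so it fans out over every active index.
query = {
    "size": 0,
    "aggs": {
        "by_service": {
            "terms": {"field": "service"},
            "aggs": {
                "by_status": {
                    "terms": {"field": "status"},
                    "aggs": {
                        "avg_duration": {"avg": {"field": "duration"}}
                    }
                }
            }
        }
    }
}

resp = requests.post(
    "http://staging-elkstack:9200/name_*/_search",
    json=query,
    timeout=30,
)
print(resp.json()["aggregations"])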

I have two choices for indexing the data:

  1. Generate one index daily

Logstash config:

...
output {
  elasticsearch {
    hosts => "staging-elkstack:9200"
    index => "name_%{+YYYY.MM.dd}"
    ...
  }
}

  2. Generate one index weekly

Logstash config:

...
output {
  elasticsearch {
    hosts => "staging-elkstack:9200"
    index => "name_%{+yyyy.ww}"
    ...
  }
}

I tried generating an index daily, but it looks like the more indices there are, the more CPU the aggregation queries consume. CPU usage actually spiked to 80% at times once the index count reached 20.

I want to try generating an index weekly, but the indexing speed looks slower than with daily indices. I think this is because indexing slows down as the number of documents in a single index grows.

My observations might be wrong; can anybody correct me?

Does anybody have the same concern? Is there a rule of thumb for the size and number of active indices?

How can I generate an index every 3 days?

How many shards in both cases?

2 shards

I believe this is where the difference is.

If we take a one-week period, in one case you defined:

  • One index per week with 2 shards: 2 shards for the period

In the other case, you defined:

  • One index per day with 2 shards: 7 * 2 = 14 shards for the same period

So maybe you could try indexing into weekly indices with 14 shards instead of 2 and see how it compares?
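At 15 GB a day, a week is roughly 105 GB, so 14 primary shards would keep each shard around 7 to 8 GB. One way to set that up is an index template along these lines (a rough sketch with Python's requests library and the legacy _template API; the template name name_weekly is made up, and the exact template syntax depends on your Elasticsearch version):

import requests

# Legacy index template applied to every index whose name starts with "name_",
# i.e. the weekly name_%{+yyyy.ww} indices created by the Logstash output above.
template = {
    "template": "name_*",          # on Elasticsearch 6+: "index_patterns": ["name_*"]
    "settings": {
        "number_of_shards": 14
    }
}

resp = requests.put(
    "http://staging-elkstack:9200/_template/name_weekly",
    json=template,
    timeout=30,
)
print(resp.json())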

Thank you for your suggestions!