Fluctuating Index Sizes


#1

Hi,
We're seeing significant fluctuations in the size of old indices in our log aggregation cluster, even though they are no longer being updated. For example, the index for Dec 12th doubled in size in the days after its last update, then shrank by around 15%.

Can someone help explain what is/may be going on here, how we can track the processes at work and what we can do to deal with it?

Regards,
David


(Mark Walkom) #2

It could be due to segment merging. With sparse doc values, a merge can increase the size of the resulting segments.


#3

Hi,
The thing is, we optimise all of our indices after 1 day, and the index I'm referring to here was still growing after 3 days. It ended up at 2.2TB for around 500 million docs before settling back to 1.7TB. When the last write came in, the index was around 950GB!

BTW, is there a way of measuring the sparsity of our field data? Would I need to run lucene commands to do that? We're running on 2.1.2...

Regards,
David


(Adrien Grand) #4

You can run an exists query (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-exists-query.html) on a given field and compare the number of matching docs with the total number of docs in your index.
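
As a rough sketch of that comparison in Python: the helper names, the field name `user_agent`, and the counts below are made up for illustration and do not come from this thread. The body is what you would send to the `_count` endpoint of the index; the ratio is then just matching docs over total docs.

```python
import json


def exists_count_body(field):
    """Build a _count request body matching docs that have a value for `field`."""
    return {"query": {"exists": {"field": field}}}


def fill_ratio(docs_with_field, total_docs):
    """Fraction of docs that have a value for the field (1.0 means fully dense)."""
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    return docs_with_field / total_docs


if __name__ == "__main__":
    # Hypothetical field name; send this body to the index's _count endpoint.
    print(json.dumps(exists_count_body("user_agent")))

    # Hypothetical counts: 125 docs have the field out of 500 in the index.
    ratio = fill_ratio(125, 500)
    print("fill ratio: {:.2f}, sparsity: {:.2f}".format(ratio, 1 - ratio))
```

A low fill ratio means the field is sparse, which is the case where merged segments may end up larger than expected.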

