Unexpectedly large disk usage in ES

Hi,

Every day I send around 5M logs to ES. Most of the logs are around 1.7 kB each, plus a small number of special logs of around 40 kB each. The special logs make up only 0.4% of the total, i.e. about 20k special logs per day.
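For reference, here is a rough back-of-the-envelope calculation of the raw volume from these figures (a sketch only; the sizes are the approximate averages stated above):

```python
# Rough raw volume per day from the numbers above (all sizes approximate).
normal_count, normal_size = 5_000_000, 1_700    # ~1.7 kB regular logs
special_count, special_size = 20_000, 40_000    # ~40 kB special logs (0.4%)

raw_bytes = normal_count * normal_size + special_count * special_size
print(f"~{raw_bytes / 1e9:.1f} GB of raw log payload per day")  # ~9.3 GB
```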

For these logs, I found that I am using around 10 GB of disk storage per day. I don't think my logs are large enough to justify that rate of disk usage, so I have two questions.

  1. Is 5M logs per day really enough data to require 10 GB of disk storage?

  2. Are there any optimizations that can be used to compress the data or reduce disk usage?

Michael

Hey,

you may want to read some older blog posts about Elasticsearch storage requirements.

Note that those blog posts are rather old and parts of them are out of date, but they may give you an idea of where to save space in the first place. If you don't need to make a field searchable, then don't. Also, Elasticsearch 5.0 introduced new numeric types (scaled_float, half_float) that reduce storage needs further. Make sure you are using an index template as well (see Logstash for an example), and avoid sparse fields.
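To make that a bit more concrete, here is a minimal sketch of an index template along those lines, sent via the Python client. The index pattern, mapping type, and field names (message, status, response_time, cpu_load, debug_payload) are made up for illustration; the syntax targets 5.x (6.0+ uses index_patterns instead of template). I have also added the best_compression codec, which isn't mentioned above but addresses the compression part of your question.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch()  # assumes a locally reachable cluster

# Hypothetical template for daily log indices; field names are examples only.
template = {
    "template": "logs-*",  # 6.0+ uses "index_patterns": ["logs-*"]
    "settings": {
        # Extra option not mentioned above: DEFLATE for stored fields
        # instead of LZ4, trading some CPU for smaller indices.
        "index.codec": "best_compression"
    },
    "mappings": {
        "log": {
            "properties": {
                # Searchable free text; norms disabled since we don't score on it.
                "message": {"type": "text", "norms": False},
                # Exact-match filtering only.
                "status": {"type": "keyword"},
                # Two decimal places of precision stored internally as a long.
                "response_time": {"type": "scaled_float", "scaling_factor": 100},
                # Half-precision float where full precision isn't needed.
                "cpu_load": {"type": "half_float"},
                # Large payload we never search or aggregate on: keep it out of
                # the inverted index and doc values; it stays in _source only.
                "debug_payload": {"type": "keyword", "index": False, "doc_values": False}
            }
        }
    }
}

es.indices.put_template(name="logs", body=template)
```

Whether you can drop doc_values on a field depends on whether you ever sort or aggregate on it, so double-check each field against your queries before copying any of this.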

Hope this helps as a start (it's not more than that, but getting your mapping right is the most important step to keep storage requirements at bay).

--Alex
