What is the ratio between raw data and ingested data stored in an Elastic cluster?

Hi,

I want to know the ratio between raw data and the ingested data that is stored in an Elastic cluster.

I know raw logs and ingested logs are not the same size.
Also, is there any way to find the volume of raw logs coming into the cluster?

I can already calculate the ingested log volume from the index sizes.

Thank you...!

There is no fixed ratio; it depends entirely on what your data looks like and on your mapping.

You will need to test it yourself with the data you plan to index and the mapping you plan to use.
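As a rough illustration of such a test, here is a minimal Python sketch. It assumes you have indexed a sample log file into a test index (the name test-logs is hypothetical), fetched the row for it from the cat indices API with format=json and bytes=b, and know the size of the original file. The numbers below are made up, so treat this as a sketch rather than a definitive method.

```python
def primary_store_bytes(cat_rows):
    """Sum the primary store size, in bytes, over _cat/indices rows."""
    total = 0
    for row in cat_rows:
        size = row.get("pri.store.size")
        if size is not None:  # some indices (e.g. closed ones) may report no size
            total += int(size)
    return total


def raw_to_stored_ratio(raw_bytes, stored_bytes):
    """Return raw bytes divided by on-disk primary bytes."""
    if stored_bytes <= 0:
        raise ValueError("stored_bytes must be positive")
    return raw_bytes / stored_bytes


# Hypothetical sample, shaped like the JSON returned by
# GET _cat/indices/test-logs?format=json&bytes=b&h=index,pri.store.size
sample_rows = [{"index": "test-logs", "pri.store.size": "262144000"}]

# Assumed size of the raw sample file, e.g. from os.path.getsize()
raw_file_bytes = 524288000

ratio = raw_to_stored_ratio(raw_file_bytes, primary_store_bytes(sample_rows))
print(f"raw : stored = {ratio:.2f} : 1")
```

Using pri.store.size leaves replicas out of the comparison. The ratio you get applies only to that data and that mapping, so repeat the test for each kind of log you plan to ingest.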

This blog post is a little old, but it explains some of how compression works in Elasticsearch. Note that things have improved in recent versions.

Hi Leandro,

Thank you for the reply.
Actually, I don't need an exact ratio. I want to get the raw log data volume per day for an on-prem Elastic deployment.
Is there any way to get the raw log volume? That would be great.

I'm using Elasticsearch 8.1.2

Thank you...!
Hiruni

If you install the mapper-size plugin, you can enable the _size field, which indexes the size in bytes of each document's original _source. You can then aggregate on that field to estimate the ingested message size.
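A minimal Python sketch of that aggregation follows. It assumes the plugin is installed, that _size is enabled in the index mapping ("_size": {"enabled": true}), and that your documents have an @timestamp field. The function names and the sample response are hypothetical, used here only for illustration. Only documents indexed after _size was enabled carry the field, and the value is the size of _source after any ingest processing, so it approximates the raw log volume rather than measuring it exactly.

```python
def build_daily_size_query(timestamp_field="@timestamp", since="now-7d/d"):
    """Build a search body that sums _size per calendar day."""
    return {
        "size": 0,  # no hits needed, only the aggregation
        "query": {"range": {timestamp_field: {"gte": since}}},
        "aggs": {
            "per_day": {
                "date_histogram": {
                    "field": timestamp_field,
                    "calendar_interval": "day",
                },
                "aggs": {"source_bytes": {"sum": {"field": "_size"}}},
            }
        },
    }


def daily_bytes(search_response):
    """Map each day bucket to its summed _source size in bytes."""
    buckets = search_response["aggregations"]["per_day"]["buckets"]
    return {b["key_as_string"]: int(b["source_bytes"]["value"]) for b in buckets}


# Hypothetical response fragment with made-up numbers, shaped like what
# POST logs-*/_search returns for the body built above.
sample_response = {
    "aggregations": {
        "per_day": {
            "buckets": [
                {"key_as_string": "2022-05-01T00:00:00.000Z",
                 "doc_count": 1000, "source_bytes": {"value": 750000.0}},
                {"key_as_string": "2022-05-02T00:00:00.000Z",
                 "doc_count": 1200, "source_bytes": {"value": 910000.0}},
            ]
        }
    }
}

for day, size in daily_bytes(sample_response).items():
    print(day, size, "bytes")
```

You can send the body from build_daily_size_query() with any HTTP client or paste it into Kibana Dev Tools; the parsing step stays the same.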
