Size of indices is too large

I have 6.9 GB of logs, but the index created from them is more than 20 GB. Is there an efficient way to control the size of the indices?

I am assuming you mean total storage size is 20GB when you consider shards and replicas?

How did you ingest your logs? Did you compress them? If so, how?

No, I didn't compress them. I just uploaded the logs using Logstash.

This is really an Elasticsearch question, and you would get better answers in that forum. By default, ES keeps two copies of all the data (a primary and one replica), plus indexes that tell it which words occur where in that data. There are options for indexing the data that will reduce the size of the index (and also reduce its utility). It does not seem unusual to me for the overall data usage in ES to be three times the size of the logs loaded.
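As a rough check on those numbers, here is a small sketch that splits the reported storage across copies. It assumes the 20 GB figure includes one replica (which has not been confirmed in this thread) and treats "more than 20GB" as exactly 20 GB; the function name is made up for illustration.

```python
def per_copy_ratio(raw_gb: float, stored_gb: float, replicas: int = 1):
    """Split total storage across copies and compare one copy to the raw logs."""
    copies = 1 + replicas          # one primary plus its replicas
    per_copy = stored_gb / copies  # storage used by a single copy of the data
    return per_copy, per_copy / raw_gb

# Figures from this thread; the replica count is an assumption (the ES default).
per_copy, ratio = per_copy_ratio(raw_gb=6.9, stored_gb=20.0)
print(f"{per_copy:.1f} GB per copy, {ratio:.2f}x the raw logs")
```

Under those assumptions, each copy would be about one and a half times the size of the raw logs, which is well within the range described above.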

Is there any mechanism through which I can control the size?

You could eliminate the replica, which would remove half of the data, but that would mean you have no backup if there is a failure.

As I said, there are options to reduce the amount of indexing that ES does. For example, I had no use for proximity data, and disabling that indexing option gave me significant savings in storage.
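One mapping parameter that fits this description is `index_options` on a `text` field: the default, `positions`, stores term positions for phrase and proximity queries, while `freqs` stores only document numbers and term frequencies. The sketch below only builds the index-creation body; the field name `message` is an assumption about what the logs contain, and this may not be the exact option used above.

```python
import json

# Values that Elasticsearch accepts for index_options on a text field.
ALLOWED_INDEX_OPTIONS = {"docs", "freqs", "positions", "offsets"}

def build_mapping(field: str, index_options: str = "freqs") -> dict:
    """Return an index-creation body whose text field skips position data."""
    if index_options not in ALLOWED_INDEX_OPTIONS:
        raise ValueError(f"unsupported index_options: {index_options!r}")
    return {
        "mappings": {
            "properties": {
                # "message" is a hypothetical field name used for illustration.
                field: {"type": "text", "index_options": index_options}
            }
        }
    }

body = build_mapping("message")
print(json.dumps(body, indent=2))
```

A body like this would be sent when the index is created (or placed in an index template), since the option applies to newly indexed data. The trade-off is that phrase and proximity queries on that field stop working.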

Again, this is really an Elasticsearch question, and you will get better responses in that forum.

Next time I will be more precise about where I post. Since you have been replying in this thread, I am continuing here. But how do I achieve this? How do I delete half of the data? And how do I define this in my grok so that half of the data is removed automatically when the indices are created?

Hi Vikas,

There are several ways to achieve this.

If you have only a few indices, you can go to Kibana => Management => Index Management => select your index => Edit settings.
There you can set "index.number_of_replicas": "0".

But with this change, you won't have any backup of your data.
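The same setting can also be changed outside Kibana with the index settings API (`PUT /<index>/_settings`). It is an index setting, not something defined in a grok pattern. Here is a minimal sketch that builds the request without sending it; the host `http://localhost:9200` and the index name are placeholders, and authentication is left out.

```python
import json
import urllib.request

def replica_settings_request(base_url: str, index: str, replicas: int = 0):
    """Build (but do not send) a PUT <index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Placeholder host and index name; substitute your own.
req = replica_settings_request("http://localhost:9200", "logstash-2020.01.01")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To apply it: urllib.request.urlopen(req)
```

For indices that Logstash creates every day, it is usually easier to put `number_of_replicas` in an index template so that new indices pick it up automatically.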

Have you gone through this guide in the docs?

Yes, but it isn't much help because the logs are continuously being updated, so I am planning to use ILM. Also, since there is a backup file for deleted indices, how can I retrieve it?

The size of your index and how it compares to the raw log size will largely depend on how much data you add during enrichment and what mappings you use. This blog post, which is now getting a bit old, discusses this. Although not all of the individual details are still accurate, since Elasticsearch has evolved, the overall concepts are still the same.
