Struggling with storage management

I'm at about 50 Windows servers that index weekly (e.g. winlogbeat-2018.34 would be the latest week of the year).
Storage is coming out to about 10 GB a week, depending on the amount of activity on the servers. This is simply not sustainable, especially considering all of the other systems that are also generating another 10 GB a week.

My question is - how are people managing their storage?
I'm thinking the biggest problem is that I'm indexing almost 1000 fields for the Winlogbeat events. I plan on mitigating this through the Logstash config. Is anyone doing something similar? Ideally I'd define the 20-30 fields that actually need to be indexed and then just dump the rest of the data into a misc field. Does that sound reasonable?
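For reference, the Logstash `prune` filter can whitelist fields by regex and drop everything else. A minimal sketch (the field names are just placeholders for whichever 20-30 fields you decide to keep; note that `prune` only operates on top-level fields, so nested Winlogbeat fields may need flattening first):

```
filter {
  prune {
    # Keep only fields matching these patterns; all other fields are removed.
    whitelist_names => ["^@timestamp$", "^host$", "^event_id$", "^message$"]
  }
}
```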

Best compression is already enabled, as search performance is at the bottom of my priority list. It is odd to me that compression is not better: I have millions of log events that are almost exact copies of one another, because they are just log in / log out Windows events from our authentication management system.
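One thing that can help the compression ratio on indices that are no longer being written to is force-merging them down to a single segment, which lets `best_compression` work over larger runs of similar documents. A sketch, assuming your weekly naming scheme and an index from a past week:

```
POST /winlogbeat-2018.33/_forcemerge?max_num_segments=1
```

Only run this against read-only indices; force-merging an index that is still receiving writes is not recommended.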

I'm pretty new to the Elastic Stack. Currently I have a one-node cluster in my environment as I've been learning. I plan to scale this out once I have storage under control.

Thanks. Any advice is appreciated.

You could also store them but make them non-indexed (i.e. not searchable). You'd then search on those 30-ish fields, but you'd still be able to see the other values in the events.
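A sketch of what that looks like in a mapping (field names are illustrative, using the single `doc` type of 6.x): `"index": false` keeps a field in `_source`, so it is still visible in the event, without building search structures for it, and `"enabled": false` does the same for an entire object without even parsing its contents:

```
"mappings": {
  "doc": {
    "properties": {
      "message":    { "type": "text",   "index": false },
      "event_data": { "type": "object", "enabled": false }
    }
  }
}
```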

You could also look at the aggregate filter in Logstash to trim things down.
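For the near-duplicate logon/logoff events, the aggregate filter can collapse repeats into a single event carrying a count. A rough sketch, assuming an `event_id` field to group on; the originals are cancelled so only the rolled-up event is indexed:

```
filter {
  aggregate {
    task_id => "%{event_id}"
    # Count the events in this group, then drop the individual event.
    code => "map['count'] ||= 0; map['count'] += 1; event.cancel"
    push_map_as_event_on_timeout => true
    timeout => 60
    timeout_task_id_field => "event_id"
  }
}
```

Be aware the aggregate filter requires a single Logstash worker (`-w 1`) to keep grouping correct, which may limit throughput.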

Could I test the storage savings of not indexing all of the fields? I.e., would closing an index also make all of its fields non-searchable?

It makes the entire index non-searchable.

Right, but would the storage reflect that? I'm trying to figure out if it's even worth the time cleaning these fields up.

It will still exist on disk, you just won't be able to see it via the APIs until you open it back up.

Which version of Elasticsearch are you using?

Are there any guidelines on how to configure fields in the Logstash config file? I'm struggling with the documentation. Ideally, I would want to exclude everything except the following 30 or so fields.

Thanks, appreciate the help.


It is great to hear that you are on a recent version, as a lot of improvements around handling large numbers of sparse fields have been added lately. If you were on a version prior to 6.0 this would have had a much more significant impact.

Hey Mark,
Just wanted to follow up and see if there's an easy way for me to map the explicit fields I want to include for indexing.


You need to do that in a template, it's not a Logstash feature.
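Roughly, in an index template you can turn off dynamic mapping and list only the fields you want indexed; anything else still lands in `_source` (and so stays visible) but gets no index structures. A sketch for a 6.x single-type index, with illustrative field names and a hypothetical template name:

```
PUT _template/winlogbeat-trimmed
{
  "index_patterns": ["winlogbeat-*"],
  "settings": { "index.codec": "best_compression" },
  "mappings": {
    "doc": {
      "dynamic": false,
      "properties": {
        "@timestamp": { "type": "date" },
        "host":       { "type": "keyword" },
        "event_id":   { "type": "keyword" },
        "message":    { "type": "text" }
      }
    }
  }
}
```

With `"dynamic": false`, unmapped fields are not searchable or aggregatable, so make sure everything you query or build dashboards on is listed explicitly.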

I'm considering this:

Do you have any experience with this? Opinions?

EDIT: is this a paid Gold or Platinum feature?


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.