Trying to understand "indexes" and fields to store logs


I'm new to Elasticsearch and I'm trying to figure out the best practices for storing logs.

From what I've understood, it's good to have one index for all logs and to increment the index name by day, week, month, or whatever suits (e.g. logs-2019.05.21). What I'm confused about now is that when there are many different log sources, there are many different fields.

For example, let's say these are the logs and their fields:
cloudfront-access-log: status, ip, referrer, time_taken, location
apigateway-access-log: status, ip, user, stage, execution_time

In this case, both have a couple of similar fields, but also completely different ones. Now let's say there are many more logs, each with many fields that the others don't have. Should I be concerned about it at all? What is the best way to think about this?

I'm trying to keep best practices in mind and make sure this step doesn't create performance bottlenecks when scaling up.

Welcome :smiley:

On having one index for all logs: it depends. If they are the same sort of log, then yep. If they are totally different formats, then it's best to separate them.

On incrementing the index name by date: yep! Or we'd probably recommend using index lifecycle management (ILM) these days. It does the same thing but is much easier to manage and has heaps of other benefits.
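
For illustration only, here is a rough sketch of what an ILM policy body could look like, built as a Python dict. The function name `build_ilm_policy`, the policy name `logs-policy`, and the thresholds are all made-up assumptions, not recommendations; the resulting JSON is the kind of body you would send to `PUT _ilm/policy/logs-policy`.

```python
import json


def build_ilm_policy(rollover_age: str, rollover_size: str, delete_after: str) -> dict:
    """Build a request body for an ILM policy (hypothetical helper).

    hot phase: roll over to a new index once it is old or large enough.
    delete phase: remove an index some time after it has rolled over.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_age": rollover_age,
                            "max_size": rollover_size,
                        }
                    }
                },
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    # Example thresholds only; tune them to your own retention needs.
    body = build_ilm_policy("1d", "50gb", "30d")
    print(json.dumps(body, indent=2))
```

With a policy like this attached to an index template, rollover replaces the manual date suffix: new indices are created automatically when the age or size condition is met.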

On whether to be concerned about lots of differing fields: yep, and that goes back to my first comment. Group similar things and avoid having too much disparity within one index.
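
As a minimal sketch of "group similar things", assuming you simply put the log source into the index name: the helper `index_name` and the `logs-<source>-<date>` pattern below are hypothetical choices for illustration, not an Elasticsearch convention.

```python
from datetime import date


def index_name(source: str, day: date) -> str:
    """Return a per-source, per-day index name (hypothetical naming scheme).

    Index names must be lowercase, so the source is lowercased defensively.
    """
    return f"logs-{source.lower()}-{day:%Y.%m.%d}"


if __name__ == "__main__":
    day = date(2019, 5, 21)
    # Each log format gets its own index, so their mappings stay separate.
    for source in ("cloudfront-access-log", "apigateway-access-log"):
        print(index_name(source, day))
```

Both sets of indices still match a pattern such as `logs-*`, so you can search across them together while each keeps only the fields of its own format.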

Thank you! That clears it up for me.
