Analysing arbitrary log files (newbie)

Hi

I would like to ingest and correlate messages from a number of log files that I have on disk. I have installed Elasticsearch, Kibana and Filebeat. I have configured Filebeat to read the log files from a directory, and I can see some indications of that in the Kibana Discover page.

So, I have signs of life.

There are some basic points that I need guidance on please. I am hoping that there is a suitable tutorial for my use case, i.e. how to ingest and analyse arbitrary log files, that I could be referred to. So, the points below could be answered directly here (thank you), or via a pointer to a tutorial (thanks even more). I know there are lots of tutorials, but none that quite seem to fit my needs.

How do I remove the messages that are showing as a result of previous iterations of changing the Filebeat config, so that I only see the latest messages? As matters stand, for example, I am seeing messages that appear to represent a directory listing, which is probably an artefact of an earlier configuration attempt.

How can I check what kind of data is being read from a given log file, so that I can see whether Filebeat is making sense of the log file format?

Since Filebeat will likely get confused by some of these log files, what do I do about that? For example, should I add a specific content format filter, and, if so, how?

Many thanks

Nathan

Hi mate!!!

First thing that calls my attention is that you don't mention Logstash. I don't know if that's an omission or you really didn't install it. In any case, Logstash is a key piece when it comes to processing your data.

I'm quite new to Elastic myself, so I'll share with you what I've been up to, as I'm quite happy with where it is leading me in terms of knowledge... although I can see that there is a huge way ahead.

Now, to try to answer your questions:

--> What I usually do while testing things is just delete the "old" index altogether and reindex the thing. There are other methods, like using the _reindex API, so you don't need to remove your original index... but in my case, deleting the old index and indexing again is just fine.
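Just to make that concrete, this is the kind of thing I run from Kibana Dev Tools (the index name below is only a made-up placeholder; run GET _cat/indices first to see what Filebeat actually created for you):

```
# List the indices you currently have
GET _cat/indices?v

# Delete the stale one so only freshly ingested messages remain
# (the name below is a made-up example)
DELETE /filebeat-7.17.0-2024.01.01
```

One caveat: Filebeat keeps a registry of files it has already read, so if you want it to re-ship the same files into a fresh index, you may also need to stop Filebeat and clear its registry (under Filebeat's data directory) before restarting it.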

--> I don't quite understand this one, but I think that's because in my case the one making sense of the file format is Logstash. At this time I'm using Filebeat basically just to ship the "raw" information, with the only exceptions being the multiline configuration, which I do at the Filebeat level, and adding one or two tags (a sketch of that is just below). When using Logstash, any filter that fails to apply will usually result in a tag on that doc (line of the log), similar to _grokparsefailure or _dateparsefailure, meaning that that specific doc did not match the respective filter it went through (grok, date, etc.). You can also add tags on failure with your own specific names.
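A minimal sketch of the Filebeat side, assuming a log input reading from a made-up path where each new log entry starts with a date (the path, pattern and tag are just placeholders), shipping everything to Logstash on the default Beats port:

```
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/*.log          # placeholder path
    # Any line that does NOT start with a date is treated as a
    # continuation of the previous line (e.g. stack traces)
    multiline.pattern: '^\d{4}-\d{2}-\d{2}'
    multiline.negate: true
    multiline.match: after
    tags: ["myapp"]                   # placeholder tag, handy for filtering later

output.logstash:
  hosts: ["localhost:5044"]           # assumes Logstash listens on the default Beats port
```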

--> Most of the filters I use with Logstash are "grok" filters: Grok filter plugin | Logstash Reference [8.11] | Elastic
Kibana comes with a "Grok Debugger" that will let you test your grok filters, and it works really great!
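For reference, a minimal sketch of the kind of filter block I mean, assuming log lines that look like "2024-01-31T10:15:00 INFO something happened" (you would work out your own pattern in the Grok Debugger):

```
filter {
  grok {
    # Assumed layout: "<ISO8601 timestamp> <log level> <rest of the message>"
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
    # Optional: use your own tag instead of the default _grokparsefailure
    tag_on_failure => ["my_grok_failure"]
  }
  date {
    # Use the parsed timestamp as the event's @timestamp
    match => ["timestamp", "ISO8601"]
  }
}
```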

Good luck!!!

Thank you for following up.

I also thought it was odd not to have Logstash. I followed this guide to have Beats integrated with Elasticsearch, and that seemed to work (for a given definition of "work"):

That article notes that "Elastic has several methods for getting data in to Elasticsearch", of which Beats is one.

I started off here with the hope that I could simply ingest some log file data and start to make sense of it, but it is clearly going to be much more complicated than that. I shall persevere, but I can't help wondering what trick I am missing, as there ought to be a simpler starting point ...

Many thanks also for your other pointers.

Regards

Nathan
