Question about creating Brobeat

I am working on a new beat called brobeat here -

However, it looks like what I want is to build a filebeat module? Does it make sense for me to create a copy/fork of filebeat and add my bro-module?

All of the community beats seem to consume some API. What I need is to read log files off the disk, ingest them into Elasticsearch, grok-pattern-match them, and maybe rename a few fields, etc.

I am trying to avoid using Logstash if I can for now.

So I guess that libbeat doesn't do a Logstash-like function; it just ships logs, but it can tell an Elasticsearch Ingest node what pipeline to use?

So I would have to define a bunch of ingest-pipelines?
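Something like this in filebeat.yml, I assume (a sketch; the paths and pipeline name here are made up, assuming a Filebeat version whose Elasticsearch output supports the `pipeline` setting):

```yaml
# filebeat.yml (sketch; paths and pipeline name are placeholders)
filebeat.prospectors:
  - input_type: log
    paths:
      - /usr/local/bro/logs/current/*.log
output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: "bro-pipeline"   # ingest pipeline to run on each event
```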

What do you think my path should be?

I would also love to talk to someone about how the filebeat modules integrate into filebeat. I don't see mentions of them in the filebeat golang code? Maybe it is happening deeper in libbeat somewhere?

Any help MUCH appreciated.



It looks like there is new functionality for filebeat that IS called filebeat-modules -

It sounds like that is what I am trying to build? Are you going to make a cookiecutter for filebeat modules and have a way to install them like plugins to filebeat?

Why not just use filebeat and an ingest node?
Rebuilding filebeat just for one log type seems like overkill 🙂

The Filebeat modules will contain Ingest Node configurations + prospector configuration + field definitions + docs + etc. Here is an example for nginx access logs:
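The layout is still in flux, but roughly the module bundles everything per fileset. Something like this (names illustrative, not the final repo paths):

```
module/nginx/
  access/
    config/nginx-access.yml   # prospector configuration
    ingest/pipeline.json      # Ingest Node processors (grok, rename, ...)
    _meta/fields.yml          # field definitions
    _meta/docs.asciidoc       # docs
```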

It sounds like what you want to build will match FB modules quite well, but they are currently heavy work in progress (we're just passing the prototyping phase), so it's a bit early to contribute to them.

But you can already create the Ingest Node pipeline configuration and load it manually or with a script into Elasticsearch. What Filebeat modules will bring is a little bit of automation around loading all the necessary files.
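For example, assuming a pipeline named `bro-conn` (the name and grok pattern are made up for illustration), you can load a definition like this with `PUT _ingest/pipeline/bro-conn`:

```json
{
  "description": "Parse Bro conn.log lines (illustrative)",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{NUMBER:ts}\\t%{IP:id_orig_h}\\t%{NUMBER:id_orig_p:int}"]
      }
    }
  ]
}
```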


Do you have any timeline or a feature branch I could watch in the meantime?

Very exciting stuff!

The prototype is in the master branch. Currently it's a Python-based filebeat wrapper. Don't expect this to be the final outcome. The prototype lets us play with and change the feature without much effort, to develop a good idea of how the final module support will work.

I have a question about filebeat modules and ingest nodes. BroIDS logs have a bunch of boilerplate headers that include the log field names/types.

Is there a way to parse that? I tried to generate patterns here -, but what I don't understand is what happens when an optional field is not there? I assume the grok pattern fails?

Is there a way either in the ingest.json grok to label a field as optional or have it parse the log file header to see what fields are actually there?

This has me leaning back to the idea of a full beat that parses and understands the bro-log format?

Thanks for all the quick and insightful responses!

I have also taken a first stab at creating the bro filebeat modules here -

I haven't tested them yet.

I tested out the logstash conf/patterns last night and they were failing to parse the http.logs 😢

Also, I tried just using the Logstash default bro grok patterns and those also failed. I think the problem is the way bro generates its logs: the fields are not static and change a lot depending on which plugins/modules you have enabled.

So this has me leaning again towards a full beat, or maybe I'm just not very good at grok?

Perhaps it is worth posting a question in the Logstash forum about the bro patterns not working. I would hope that Logstash can actually deal with almost all cases somehow 🙂

For the separate beat: it sounds like overkill to create a beat for this if it's still a log file in the common sense, i.e. logs written line by line. But I don't know the details of the Bro NSM logs.

There are several tricks you can use:

  • enclose fields in (...)? to make them optional
  • use multiple patterns, which are tried in order
  • use on_failure handlers
  • use conditionals in Logstash
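Since grok compiles down to regular expressions, the optional-field behavior can be illustrated with plain Python `re` (a sketch of the same idea, not Logstash itself; the pattern is made up):

```python
import re

# A grok pattern like "%{IP:ip}( %{NUMBER:port})?" compiles to a regex
# with an optional group; the port capture is None when the field is absent,
# and the overall match still succeeds.
pattern = re.compile(r"(?P<ip>\d+\.\d+\.\d+\.\d+)( (?P<port>\d+))?$")

m1 = pattern.match("10.0.0.1 443")
m2 = pattern.match("10.0.0.1")

print(m1.group("ip"), m1.group("port"))  # 10.0.0.1 443
print(m2.group("ip"), m2.group("port"))  # 10.0.0.1 None
```

The key point is that the second line still matches even though the port is missing, instead of failing the whole grok.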

For the ingest node, I recommend playing with the simulate API to figure it out:
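A simulate request takes an inline pipeline plus sample documents, e.g. `POST _ingest/pipeline/_simulate` (the grok pattern here is illustrative):

```json
{
  "pipeline": {
    "processors": [
      { "grok": { "field": "message", "patterns": ["%{IP:ip}( %{NUMBER:port:int})?"] } }
    ]
  },
  "docs": [
    { "_source": { "message": "10.0.0.1 443" } },
    { "_source": { "message": "10.0.0.1" } }
  ]
}
```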

So I will look into it again, but I still don't understand how it can work.

Let me give you an example:

fields: (ts:double, required), (ip:ip, required), (port:int, optional), (f1:int, optional), (f2:int, optional), (f3:int, optional), (f4:int, optional)

What if one log line has f1 and f4, and another has f2 and f4?

How would Logstash or an ingest node's grok pattern know whether to label the two int fields as (f1, f4) or (f2, f4)?
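In principle this is exactly what the `#fields` header in a Bro log solves: each file declares its own column order up front, so a header-driven parser never has to guess. A minimal sketch (simplified: it assumes tab separation and ignores the `#separator`/`#types` headers and Bro's unset markers; the sample field names are made up):

```python
def parse_bro_log(lines):
    """Parse TSV Bro log lines, using the #fields header to name columns."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]  # column names, in declared order
            continue
        if line.startswith("#") or not line:
            continue  # skip other header/comment lines and blanks
        values = line.split("\t")
        yield dict(zip(fields, values))

sample = [
    "#fields\tts\tid.orig_h\tf2\tf4",
    "1469000000.123\t10.0.0.1\t7\t9",
]
print(list(parse_bro_log(sample)))
# [{'ts': '1469000000.123', 'id.orig_h': '10.0.0.1', 'f2': '7', 'f4': '9'}]
```

A static grok pattern can't do this, which is what keeps pulling me back towards a full beat (or at least a custom parser) for the bro format.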

I created a question in the Logstash topic here: Logstash-patterns-core BRO_* not working?

This topic was automatically closed after 21 days. New replies are no longer allowed.