Question about creating Brobeat

I am working on a new beat called brobeat here - https://github.com/blacktop/brobeat.

However, it looks like what I want to build is a Filebeat module? Does it make sense for me to create a copy/fork of Filebeat and add my bro module?

All of the community beats seem to consume some API. What I need is to read log files off the disk, ingest them into Elasticsearch, grok-pattern-match them, and maybe rename a few fields, etc.

I am trying to avoid using Logstash if I can for now.

So I guess that libbeat doesn't do a Logstash-like function; it just ships logs, but it can tell an Elasticsearch Ingest node what pipeline to use?

So I would have to define a bunch of ingest-pipelines?
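If it helps, here is the kind of thing I have in mind -- a sketch of a filebeat.yml fragment, assuming the 5.x `output.elasticsearch.pipeline` option; the paths and pipeline name are made up:

```yaml
# Hypothetical filebeat.yml fragment: ship bro conn.log lines and tell the
# Elasticsearch output which ingest pipeline should parse them.
filebeat.prospectors:
  - input_type: log
    paths:
      - /usr/local/bro/logs/current/conn.log

output.elasticsearch:
  hosts: ["localhost:9200"]
  # Name of an ingest pipeline that must already be loaded into Elasticsearch
  pipeline: "bro-conn-pipeline"
```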

What do you think my path should be?

I would also love to talk to someone about how the Filebeat modules integrate into Filebeat. I don't see any mention of them in the Filebeat Go code? Maybe it is happening deeper in libbeat somewhere?

Any help MUCH appreciated.

Thanks!


It looks like there is new functionality for filebeat that IS called filebeat-modules - https://github.com/elastic/beats/pull/3158

It sounds like that is what I am trying to build? Are you going to make a cookiecutter for filebeat modules and have a way to install them like plugins to filebeat?

Why not just use filebeat and an ingest node?
Rebuilding filebeat just for one log type seems like overkill :slight_smile:

The Filebeat modules will contain Ingest Node configurations + prospector configuration + field definitions + docs + etc. Here is an example for nginx access logs: https://github.com/elastic/beats/tree/master/filebeat/module/nginx/access

It sounds like what you want to build will match FB modules quite well, but they are currently very much a work in progress (we're just past the prototyping phase), so it's a bit early to contribute to them.

But you can already create the Ingest Node pipeline configuration and load it manually or with a script into Elasticsearch. What Filebeat modules will bring is a little bit of automation around loading all the necessary files.
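For example, a hand-loaded pipeline body could look roughly like this (a sketch only -- the field names and grok pattern below are invented, and you would PUT the body to `_ingest/pipeline/<some-id>` with curl or a script):

```json
{
  "description": "Parse bro conn.log lines (hypothetical field layout)",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{NUMBER:ts}\t%{IP:id.orig_h}\t%{INT:id.orig_p}"
        ]
      }
    },
    {
      "remove": { "field": "message" }
    }
  ]
}
```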


Do you have any timeline or a feature branch I could watch in the meantime?

Very exciting stuff!

The prototype is in the master branch. Currently it's a Python-based Filebeat wrapper, filebeat.py. Don't expect this to be the final outcome. The prototype lets us play with and change the feature without much effort, so we can develop a good idea of how the final module support will work.

I have a question about Filebeat modules and ingest nodes. BroIDS logs have a bunch of boilerplate headers that include log field names/types.

Is there a way to parse that? I tried to generate patterns here - https://github.com/blacktop/brobeat/blob/master/logstash/patterns/generated-bro, but what I don't understand is what happens when an optional field is not there? I assume the grok pattern fails?

Is there a way either in the ingest.json grok to label a field as optional or have it parse the log file header to see what fields are actually there?

This has me leaning back to the idea of a full beat that parses and understands the Bro log format?

Thanks for all the quick and insightful responses!

I have also taken a first stab at creating the bro filebeat modules here - https://github.com/blacktop/brobeat/tree/master/module/bro

I haven't tested them yet.

I tested out the Logstash conf/patterns last night and they were failing to parse the http.log files :cry:

Also, I tried just using the Logstash default Bro grok patterns and those also failed. I think it is something about the way Bro generates its logs: the fields are not static and change a lot depending on what plugins/modules you have enabled.

So this has me leaning again more towards a full beat, or maybe I'm just not very good at grok?

Perhaps it is worth posting a question in the Logstash forum about the Bro patterns not working. I would hope that Logstash can actually deal with almost all cases somehow :slight_smile:

For the separate beat: it sounds like overkill to create a beat if this is still a log file in the common sense, with logs written line by line. But I don't know the details of the Bro NSM logs.

There are several tricks you can use:

  • enclose fields in (...)? to make them optional
  • use multiple patterns, which are tried in order
  • use on_failure modes
  • use conditionals in Logstash
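To illustrate the first trick: grok compiles down to regular expressions, so the optional-field behavior can be sketched in plain Python regex (the pattern and sample lines below are invented for illustration):

```python
import re

# Grok compiles to a regular expression underneath, so wrapping a segment in
# (...)? makes it optional, mirroring the hypothetical grok pattern:
#   %{NUMBER:ts} %{IP:ip}( %{INT:port})?
pattern = re.compile(
    r"(?P<ts>\d+(?:\.\d+)?) "
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
    r"(?: (?P<port>\d+))?$"
)

m1 = pattern.match("1486578493.5 10.0.0.1 8080")
m2 = pattern.match("1486578493.5 10.0.0.1")

print(m1.group("port"))  # 8080
print(m2.group("port"))  # None -- the optional group simply did not match,
                         # and the overall match does not fail
```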

For the ingest node, I recommend playing with the simulate API to figure it out: Simulate Pipeline API | Elasticsearch Guide [5.1] | Elastic
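A minimal simulate request body could look like this (a sketch; the pipeline and test documents are invented) -- you POST it to `_ingest/pipeline/_simulate` and the response shows what each document would look like after the pipeline runs:

```json
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{NUMBER:ts}\t%{IP:src}(\t%{INT:port})?"]
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "1486578493.5\t10.0.0.1\t8080" } },
    { "_source": { "message": "1486578493.5\t10.0.0.1" } }
  ]
}
```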

So I will look into it again, but I still don't understand how it can work.

Let me give you an example:

fields: (ts:double, required), (ip:ip, required), (port:int, optional), (f1:int, optional), (f2:int, optional), (f3:int, optional), (f4:int, optional)

What if in one log line f1 and f4 are present, and in another f2 and f4?

How would Logstash or an ingest node's grok pattern know to label the two int fields as (f1, f4) or (f2, f4)?
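To make the ambiguity concrete, here is the same situation sketched in plain regex terms (field names and values invented):

```python
import re

# Required ts and ip, then four optional int fields f1..f4. Grok (like the
# regex it compiles to) assigns purely by position: whatever ints it finds
# fill f1 first, then f2, and so on.
pattern = re.compile(
    r"(?P<ts>\S+) (?P<ip>\S+)"
    r"(?: (?P<f1>\d+))?(?: (?P<f2>\d+))?(?: (?P<f3>\d+))?(?: (?P<f4>\d+))?$"
)

# Suppose the writer meant these two ints to be f1 and f4...
m = pattern.match("1486578493.5 10.0.0.1 42 99")
print({k: v for k, v in m.groupdict().items() if v is not None})
# ...but they always land in f1 and f2: position alone cannot recover which
# optional fields were actually present in a given line.
```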

I created a question in the Logstash topic here: Logstash-patterns-core BRO_* not working?
