Journalbeat - ship from journald to ELK


(Marcus Heese) #1

Hey guys,

I just wanted to drop a note that I'm in the process of writing a community beat that I call journalbeat.

I'm interested in getting some feedback on the idea of this project, and also - of course - to gather some ideas on what people would expect it to do.

Besides the obvious use case (shipping logs), I'm also developing it to provide a central, common data source for more advanced topics like FIM, SIEM, and audit logs / monitoring / alerting. Input and ideas on this particular use case are highly appreciated.

The software is working so far (I tested and developed on Fedora 23), and has the following features:

  • can start following the system journal from three different positions: the beginning of the journal, the end of the journal, or where it stopped reading the last time
  • reads all journal fields into one event
  • adds journal catalog entries if possible
  • can normalize / "clean" field names (__REALTIME_TIMESTAMP becomes realtime_timestamp)
  • can try to convert number fields to numbers
  • can move all journal fields to an object field
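
To illustrate the last three items, here is a minimal sketch of what field-name cleaning, number conversion, and nesting under an object field could look like. The function names (cleanFieldName, convertValue, buildEvent) and the "journal" object field are made up for this example and are not taken from the journalbeat source; the actual implementation may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cleanFieldName strips leading underscores and lowercases the name,
// so "__REALTIME_TIMESTAMP" becomes "realtime_timestamp".
// (Hypothetical helper, not the journalbeat implementation.)
func cleanFieldName(name string) string {
	return strings.ToLower(strings.TrimLeft(name, "_"))
}

// convertValue returns an int64 if the value parses as a base-10 integer,
// and the unchanged string otherwise.
func convertValue(v string) interface{} {
	if n, err := strconv.ParseInt(v, 10, 64); err == nil {
		return n
	}
	return v
}

// buildEvent cleans and converts every journal field. If objectField is
// non-empty, all fields are nested under that single key.
func buildEvent(entry map[string]string, objectField string) map[string]interface{} {
	fields := make(map[string]interface{}, len(entry))
	for k, v := range entry {
		fields[cleanFieldName(k)] = convertValue(v)
	}
	if objectField == "" {
		return fields
	}
	return map[string]interface{}{objectField: fields}
}

func main() {
	// Sample values are invented for illustration.
	entry := map[string]string{
		"__REALTIME_TIMESTAMP": "1455000000000000",
		"_PID":                 "1234",
		"MESSAGE":              "Started Session 1 of user root.",
	}
	fmt.Println(buildEvent(entry, "journal"))
}
```

Passing an empty objectField leaves the fields at the top level of the event, which is roughly what "reads all journal fields into one event" describes.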

My current plans are to add the following features near term:

  • whitelists and blacklists for fields
  • filtering of messages (like journalctl can do)
  • JSON parsing of message fields if possible
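
As a rough idea of how the first and third planned items might behave, here is a sketch. Everything in it (filterFields, parseJSONMessage, the rule that an empty whitelist means "allow everything") is an assumption for discussion, not a committed design.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toSet turns a slice of names into a lookup set.
func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return set
}

// filterFields keeps a field only if it is whitelisted (an empty whitelist
// allows everything) and not blacklisted. Hypothetical semantics.
func filterFields(fields map[string]string, whitelist, blacklist []string) map[string]string {
	allow := toSet(whitelist)
	deny := toSet(blacklist)
	out := make(map[string]string)
	for k, v := range fields {
		if len(allow) > 0 && !allow[k] {
			continue
		}
		if deny[k] {
			continue
		}
		out[k] = v
	}
	return out
}

// parseJSONMessage tries to decode msg as a JSON object. It reports false
// for plain text, for JSON that is not an object, and for "null".
func parseJSONMessage(msg string) (map[string]interface{}, bool) {
	var obj map[string]interface{}
	if err := json.Unmarshal([]byte(msg), &obj); err != nil || obj == nil {
		return nil, false
	}
	return obj, true
}

func main() {
	// Sample values are invented for illustration.
	fields := map[string]string{
		"message":    `{"level":"info","msg":"user logged in"}`,
		"pid":        "1234",
		"machine_id": "abc",
	}
	kept := filterFields(fields, nil, []string{"machine_id"})
	if obj, ok := parseJSONMessage(kept["message"]); ok {
		fmt.Println("parsed message:", obj)
	}
	fmt.Println("kept fields:", len(kept))
}
```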

Any ideas / suggestions / feedback are highly appreciated!

Cheers
Marcus


(Tudor Golubenco) #2

Hi Marcus,

We're definitely interested in a Journalbeat, and really happy you started it. I think this will be one of the important Beats.

It's in some ways similar to Winlogbeat, which @andrewkroh wrote for Windows event logs. One thing that Filebeat and Winlogbeat have is a "registry file", where they remember how far events have been acknowledged by the downstream servers, so they know what they have to re-send in case of restarts, errors, etc. Does Journalbeat have or need something equivalent?

It's a bit unfortunate that go-systemd doesn't have a pure Go implementation for reading the journal files, but perhaps it will get one over time or we can contribute one. Until then, using Cgo is perfectly fine, and our build/packaging scripts support it.

We're happy to help in any way needed, just let us know here or in the #beats channel on freenode.


(Marcus Heese) #3

Hi Tudor,

Thanks for the input!

Re "Registry File":
Currently I'm not writing a registry file. I actually thought that this functionality was taken care of by libbeat itself. Wouldn't that be the right place? I can imagine that this is common functionality that every beat would like to have. That way beats would only need to call PublishEvent() and the rest would be taken care of by the framework. Nevertheless, I'll add it to my TODO list of things that I need to implement.

Re "go-systemd":
I don't think it is possible to come up with a pure Go implementation of go-systemd. Journald does not necessarily write files that one could read with Go (see the Storage= section of journald.conf). That is particularly important in my environment, where I am not supposed to store logs / log files locally. With journalbeat, however, I can now ship everything off immediately, encrypted and compressed, which is one of the reasons why I'm writing the program.

I have actually forked go-systemd from the CoreOS repo and I plan to do pull requests so that I can use their main repository later on. The functionality I have added is unfortunately crucial (reading all log fields, catalog data, seek to head/tail/cursor, etc.), so I had to do a fork. I am mentioning it as a caveat because depending on the distribution, there might be different library dependencies and therefore it is probably necessary to do a different build for every distro.

I'll definitely drop by on freenode later today! Thanks for the input so far.

Cheers
Marcus


(Tudor Golubenco) #4

We are planning for PublishEvent() to have this kind of behavior, but for that it would need to be able to spool events to disk, which can be difficult. Winlogbeat and Filebeat avoid spooling events to disk by simply not reading new events while the current batch has not yet been acknowledged. But to do this across restarts, they need to remember where they left off, so they use a very simple JSON/YAML file which they persist to disk. Here is the Winlogbeat implementation, which you should be able to reuse.
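
To make the idea concrete, here is a minimal sketch of such a registry file for a journal reader. It is not the Winlogbeat code; the type and function names (registryState, saveState, loadState) and the file name are invented for illustration, and it assumes the position to remember is an opaque journal cursor string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// registryState is the data persisted between restarts: the cursor of
// the last entry that was acknowledged downstream. (Hypothetical layout.)
type registryState struct {
	Cursor string `json:"cursor"`
}

// saveState writes the state to a temporary file and renames it over the
// real one, so a crash mid-write does not leave a truncated registry.
func saveState(path string, st registryState) error {
	data, err := json.Marshal(st)
	if err != nil {
		return err
	}
	tmp := path + ".new"
	if err := os.WriteFile(tmp, data, 0600); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// loadState reads the registry. A missing file is not an error: it yields
// an empty state, meaning there is no saved position yet.
func loadState(path string) (registryState, error) {
	var st registryState
	data, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		return st, nil
	}
	if err != nil {
		return st, err
	}
	err = json.Unmarshal(data, &st)
	return st, err
}

func main() {
	path := filepath.Join(os.TempDir(), "journalbeat-registry-example.json")
	defer os.Remove(path)

	st, err := loadState(path)
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	fmt.Printf("resuming from cursor %q\n", st.Cursor)

	// Pretend a batch was published and acknowledged; only then is the
	// cursor of its last entry persisted. The value is a placeholder.
	st.Cursor = "example-cursor-after-batch"
	if err := saveState(path, st); err != nil {
		fmt.Println("save failed:", err)
	}
}
```

The important part is the ordering: the cursor is saved only after the batch is acknowledged, so after a restart the reader may re-send some entries but should not skip any.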

If I understand correctly, volatile means storing in a RAM disk? Actually, I didn't mean reading the files from disk directly, but rather using whatever APIs the journald libraries use, directly from Go. But this shouldn't be a priority anyway; the first important thing is that it works for you :-). We can look into making it more portable later.


(Christoffb) #5

Hi Marcus,

I am very interested in Journalbeat. I am currently doing a Logstash proof-of-concept study and we need to ship our logs from journald. When can I start testing Journalbeat? Let me know if I can contribute development time.

Kind regards,
Christoff


(ruflin) #6

@christoffb I don't want to get ahead of @mheese, but I assume the best way to contribute is opening PRs in the repo: https://github.com/mheese/journalbeat


(Marcus Heese) #7

Hi Christoff,

If you don't shy away from compiling Go source code yourself, you can actually start right away. The code that is available in the GitHub repository (https://github.com/mheese/journalbeat) is working and successfully delivers my test journald entries to the ELK stack (although, as I noticed the other day, the build instructions are not 100% correct).

Besides that, I'm planning on releasing binaries within the next 2 weeks (probably by the end of this week).

Regarding development time: pull requests are always welcome of course :slightly_smiling: however, if you have feature requests, please create issues on the github page and we can discuss development of them.

And in general I guess that the main difficulty with this beat will be the testing on all the different distributions and the different systemd versions.

Thank you for your interest by the way :slightly_smiling: I'm happy to see that others also want to have a journalbeat.

Regards
Marcus


(Marcus Heese) #8

@ruflin, yeah sure ... PRs are always welcome :slight_smile: ... Although I always like opening issues first. That way, two people don't end up developing the same thing.


(Tfendt) #9

Is this still being worked on? AWS has switched their default AMI to use Ubuntu 16. Ubuntu 16 by default now uses systemd which writes to journald. We would like to keep using journald instead of having to manage our application log files separately but it doesn't look like there is a beat for it and I would rather not install a heavy logstash client on each application to forward logs.


(ruflin) #10

journalbeat is a community beat. There is also an ongoing discussion about whether support for journald should potentially be added to Filebeat. Feel free to open a feature request for it on our GitHub repo.


(Arindam) #11

Hi! I have just started working on this project. I am going through the code as we speak.
@ruflin how do I open a feature request for the one you just mentioned?


(ruflin) #12

Just post an issue here: https://github.com/elastic/beats

