Logstash Plugins

I am trying to parse a single line of unstructured data and I believe I am overcomplicating the matter. I am a long-time SIEM engineer and content developer. That said, I am both in awe and just a tad overwhelmed by how awesome this is and how many resources are available.

My main question is this: I want to use Logstash without Elasticsearch and Kibana, pulling from a file and outputting the filtered data to a file. I am sure I can, but what would be the best way of deploying this quickly?

Does anyone recommend any specific plugins for a lightweight task, aside from the file plugins for input and output? For filtering I am using grok -- which is AWESOME -- but are dissect and mutate better or more time-friendly for someone who just needs to parse, map a couple of tags, and rename a field or two?

I am a newb with this, but I will say that in the past week I have learned a great deal. Still, I am missing my deadline because I am experimenting with so many possible resources that I am a little bit lost.

Anyone have any ideas?

It really depends on what your logs look like, how frequently you will need to parse the files, what output you want, etc.

Elasticsearch is just one of the main output plugins in Logstash; you do not need it, and you can output to a file without any problem. Actually, I would say that file input -> filters -> file output is a pretty common pipeline; I do that frequently when testing new things in my Logstash pipelines.
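For reference, a minimal skeleton of that kind of pipeline could look like this. The paths are placeholders; start_position and sincedb_path are only set so Logstash re-reads the file from the top on every run, which is handy for testing:

```
input {
    file {
        path => "/path/to/input.log"
        start_position => "beginning"   # read from the start of the file
        sincedb_path => "/dev/null"     # do not remember position between runs
    }
}

filter {
    # your parse filters (dissect, grok, kv, ...) go here
}

output {
    file {
        path => "/path/to/output.json"
    }
}
```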

But sometimes you do not even need the file input: you could use the stdin input and pipe the data into the Logstash pipeline with a cat on a file or an echo with your log line.

$ cat file.txt | /usr/share/logstash/bin/logstash -f pipeline.conf

or

$ echo "your log line" | /usr/share/logstash/bin/logstash -f pipeline.conf

To parse your message you can use one of the parse filters, mainly dissect, grok, json, and kv; it really depends on what your message looks like.

Personally I avoid grok as much as possible. While it is very powerful, in most cases grok is not needed, and dissect can parse the same message using fewer resources.

The main difference between grok and dissect is that grok uses regex to match and validate the content of each field, while dissect is positional: it matches by position without any validation of the type of the data.

If your messages have the same structure, with the fields always in the same position, use dissect. I've improved my Logstash performance using dissect instead of grok so many times that I'm a little biased now; I wrote a small comparison of the performance of those filters here.
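To make the difference concrete, here is a rough sketch of the two approaches on a syslog-style header. The grok version relies on the stock grok patterns (NONNEGINT, TIMESTAMP_ISO8601, HOSTNAME, GREEDYDATA); the dissect version just splits by position. Field names here are illustrative:

```
# grok: each field must match (and is validated by) a regex pattern
grok {
    match => {
        "message" => "<%{NONNEGINT:priority}>%{NONNEGINT:version} %{TIMESTAMP_ISO8601:timestamp} %{HOSTNAME:hostname} %{GREEDYDATA:rest}"
    }
}

# dissect: the same header split purely by position, no validation
dissect {
    mapping => {
        "message" => "<%{priority}>%{version} %{timestamp} %{hostname} %{rest}"
    }
}
```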

But again, this depends entirely on what your message looks like. You can even combine the filters: for example, if your log has a plain-text part and a json or kv part, you can use dissect to get those parts and then parse each one with the appropriate filter.

If you share what your message looks like, what you have tried, and what the expected result is, someone will probably be able to give you more insight on the best way to achieve it.

Hey, thanks so much for the clarity. Had it been a snake it would have bitten me -- I was overcomplicating things, but what a way to learn! Thanks for the help!

<14>1 2016-12-25T09:03:52.754646-06:00 xxxxxhost1 antivirus 2496 - - alertname="Virus Found" computername="xxxxxxpc42" computerip="123.45.678.910" severity="1"

Well, your log line is pretty simple: you have a plain-text part and a kv part, so you can use a combination of the dissect filter and the kv filter to parse it.

Considering that the two - - in your message are fields that may or may not be present, you can use the following dissect to parse it.

dissect {
    mapping => {
        "message" => "<%{}>%{} %{timestamp} %{hostname} %{appname} %{thread} %{extra1} %{extra2} %{kvmsg}"
    }
}

The empty %{} will not store any data; each named one will store the value at that position in a field with the specified name.

This filter will give you the following fields:

{
  "extra1": "-",
  "message": "<14>1 2016-12-25T09:03:52.754646-06:00 xxxxxhost1 antivirus 2496 - - alertname=\"Virus Found\" computername=\"xxxxxxpc42\" computerip=\"123.45.678.910\" severity=\"1\"",
  "kvmsg": "alertname=\"Virus Found\" computername=\"xxxxxxpc42\" computerip=\"123.45.678.910\" severity=\"1\"",
  "appname": "antivirus",
  "host": "weiss",
  "hostname": "xxxxxhost1",
  "@version": "1",
  "@timestamp": "2021-08-28T22:49:15.783Z",
  "extra2": "-",
  "timestamp": "2016-12-25T09:03:52.754646-06:00",
  "thread": "2496"
}

The @version, @timestamp, and host fields are created by Logstash itself.

To parse the message in the kvmsg field you just need to use the kv filter.

kv {
    source => "kvmsg"
}
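As an aside, if you would rather have those parsed keys nested under a single object instead of at the top level of the event, the kv filter also supports a target option; "av" here is just an example name:

```
kv {
    source => "kvmsg"
    target => "av"   # puts alertname, computername, etc. under [av]
}
```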

So, in the end your message will have the following fields.

{
  "kvmsg": "alertname=\"Virus Found\" computername=\"xxxxxxpc42\" computerip=\"123.45.678.910\" severity=\"1\"",
  "alertname": "Virus Found",
  "timestamp": "2016-12-25T09:03:52.754646-06:00",
  "host": "weiss",
  "message": "<14>1 2016-12-25T09:03:52.754646-06:00 xxxxxhost1 antivirus 2496 - - alertname=\"Virus Found\" computername=\"xxxxxxpc42\" computerip=\"123.45.678.910\" severity=\"1\"",
  "thread": "2496",
  "computername": "xxxxxxpc42",
  "@timestamp": "2021-08-28T22:52:07.009Z",
  "@version": "1",
  "computerip": "123.45.678.910",
  "severity": "1",
  "extra2": "-",
  "appname": "antivirus",
  "extra1": "-",
  "hostname": "xxxxxhost1"
}

If you do not want some of those fields in your final message, you can use the mutate filter to remove them.

mutate {
    remove_field => [ "fieldname1", "fieldnameN" ]
}

To write this message to a file, you just use the file output plugin.

output {
    file {
        path => "/path/to/the/output/file.json"
    }
}
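Putting all the pieces above together, the whole pipeline.conf would look something like this. It uses the stdin input so you can echo a test line into it, and removes the intermediate fields at the end; adjust the output path to taste:

```
input {
    stdin {}
}

filter {
    dissect {
        mapping => {
            "message" => "<%{}>%{} %{timestamp} %{hostname} %{appname} %{thread} %{extra1} %{extra2} %{kvmsg}"
        }
    }
    kv {
        source => "kvmsg"
    }
    mutate {
        remove_field => [ "kvmsg", "extra1", "extra2" ]
    }
}

output {
    file {
        path => "/path/to/the/output/file.json"
    }
}
```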