Logstash Pipelines for Parsing Questions - No fileset in output

mgotechlock · April 17, 2020, 1:11pm

I've tried following https://www.elastic.co/guide/en/logstash/7.6/logstash-config-for-filebeat-modules.html to create Logstash pipelines that parse the Filebeat data.

However, even the examples provided in that link don't seem to apply to me. The example config begins with:
filter {
  if [fileset][module] == "system" {
    if [fileset][name] == "auth" {
Well, when I look at my Filebeat events and dump them directly to stdout or a file, there is no "fileset.module", "fileset.name", or even "fileset" anywhere in them, so the example parsing config never matches. I can't tell whether the recommended config is wrong or whether I still need to do something to get my Filebeat output to include "fileset" values. On the Filebeat side, I have the system module enabled and the output going to Logstash (on a custom port, not 5044). Other than that, it is the default install of Filebeat.

Do you have any ideas on this?
Thanks for any help you can give.

shaunak · April 17, 2020, 5:48pm

Hi @mgotechlock, could you post here a couple of the filebeat events you dumped to stdout/file? Please make sure to redact any sensitive information in the events before posting.

Thanks,

Shaunak

mgotechlock · April 17, 2020, 6:11pm

{
         "input" => {
        "type" => "log"
    },
         "cloud" => {
        "instance" => {
            "id" => "156963326"
        },
        "provider" => "digitalocean",
          "region" => "nyc3"
    },
      "@version" => "1",
    "@timestamp" => 2020-04-17T18:08:02.238Z,
          "host" => {
                 "name" => "oompaloompa",
         "architecture" => "x86_64",
             "hostname" => "oompaloompa",
                   "os" => {
                "name" => "Ubuntu",
            "codename" => "bionic",
            "platform" => "ubuntu",
              "family" => "debian",
              "kernel" => "4.15.0-96-generic",
             "version" => "18.04.4 LTS (Bionic Beaver)"
        },
        "containerized" => false,
                   "id" => "fc61c6cc61c1434fbf7d14b4fbff55f6"
    },
           "ecs" => {
        "version" => "1.4.0"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
       "message" => "Apr 14 21:24:08 oompaloompa sshd[6934]: Invalid user redis1 from 203.159.249.215 port 58702",
         "agent" => {
        "ephemeral_id" => "d9f4cc34-072e-4b08-bd19-247c9114c9e5",
             "version" => "7.6.2",
                "type" => "filebeat",
            "hostname" => "oompaloompa",
                  "id" => "469f2965-149e-4064-9e0c-2cd0669728b5"
    },
           "log" => {
          "file" => {
            "path" => "/var/log/auth.log"
        },
        "offset" => 943508
    }
}

mgotechlock · April 17, 2020, 6:12pm

{
         "input" => {
        "type" => "log"
    },
         "cloud" => {
        "instance" => {
            "id" => "156963326"
        },
          "region" => "nyc3",
        "provider" => "digitalocean"
    },
      "@version" => "1",
    "@timestamp" => 2020-04-17T18:08:02.238Z,
          "host" => {
                 "name" => "oompaloompa",
         "architecture" => "x86_64",
             "hostname" => "oompaloompa",
                   "os" => {
                "name" => "Ubuntu",
            "codename" => "bionic",
              "family" => "debian",
            "platform" => "ubuntu",
              "kernel" => "4.15.0-96-generic",
             "version" => "18.04.4 LTS (Bionic Beaver)"
        },
        "containerized" => false,
                   "id" => "fc61c6cc61c1434fbf7d14b4fbff55f6"
    },
           "ecs" => {
        "version" => "1.4.0"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied"
    ],
       "message" => "Apr 14 21:24:08 oompaloompa sshd[6934]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=203.159.249.215",
         "agent" => {
        "ephemeral_id" => "d9f4cc34-072e-4b08-bd19-247c9114c9e5",
             "version" => "7.6.2",
                "type" => "filebeat",
            "hostname" => "oompaloompa",
                  "id" => "469f2965-149e-4064-9e0c-2cd0669728b5"
    },
           "log" => {
        "offset" => 943686,
          "file" => {
            "path" => "/var/log/auth.log"
        }
    }
}

shaunak · April 17, 2020, 6:34pm

Hmm, that's strange. I just tried to reproduce this with Filebeat 7.6.0 with the system module enabled and I'm seeing an event field in the events, which contains sub-fields like module and dataset.

Could you please post the result of the following command (again, after redacting any sensitive information)?

filebeat export config

Thanks,

Shaunak

mgotechlock · April 17, 2020, 11:22pm

So I made progress. Apparently, you are not supposed to have inputs enabled in filebeat.yml AND the modules enabled. Who knew? The documentation is terrible. Once I disabled the inputs in filebeat.yml, I see data in a better format, but it still doesn't match the config Elastic publishes at the original URL.
I do see fileset.name=auth and event.module=system, but the example config checks fileset.module=system. I can easily change the config, but I would like confirmation that the example config Elastic publishes is incorrect, so I can be sure I am not doing anything wrong.

shaunak · April 17, 2020, 11:58pm

It's possible to mix "raw" Filebeat inputs in your configuration with modules. Imagine a case where you have logs from a well-known service like Apache or system logs but also have logs from your own application. You could ingest all of these logs with a single Filebeat instance by enabling the apache and system modules but also specifying your own "raw" inputs in the Filebeat configuration.

BTW, modules start their own inputs under the hood so, at the end of the day, there are inputs configured anyway!

You're right, the configurations shown in that documentation are outdated. So sorry about that and thank you for bringing it to our attention. I've created a PR now to fix that documentation: "Updating fields to new ECS names" (elastic/logstash pull request #11807).

As I've done in the PR, I'd suggest using [event][module] in place of [fileset][module] and [event][dataset] instead of [fileset][name]. Note that the event.dataset field contains the fully-qualified name of the dataset (aka fileset), so it includes the module name as a prefix, e.g. system.auth.
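
For reference, here is a minimal sketch of how the outer conditionals might look with the ECS field names suggested above. It assumes Filebeat 7.x events from the system module. The tag name is hypothetical and only marks where the auth parsing filters from the documented example would go.

```conf
filter {
  # Assumes events shipped by Filebeat's system module with ECS field names.
  if [event][module] == "system" {
    # event.dataset is fully qualified: module name, a dot, then the fileset name.
    if [event][dataset] == "system.auth" {
      # Placeholder for the grok/date filters from the documented auth example.
      mutate {
        # Hypothetical tag, used here only to show that the branch matched.
        add_tag => ["system_auth"]
      }
    }
  }
}
```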

mgotechlock · April 20, 2020, 11:13am

Awesome. Thanks for your help. I understand it now. If I could ask one more question:
Taking just the [auth] part of the system pipeline, I've enabled it and am getting logs, but some are being parsed properly while others are not. I believe the unparsed ones do not match any of the 7 "match" statements in the example config. I can obviously add more, but I wanted confirmation that this is to be expected and that I should not expect the example config to catch every possible entry. And if that is true, any idea how many "match" statements I might end up needing to create?

mgotechlock · April 20, 2020, 11:30am

FYI, if you do change the example to event.dataset, change the value to system.auth.
My logs show that fileset.name = auth still works, so I think either is fine, though I'm admittedly only looking at Ubuntu at the moment.

shaunak · April 20, 2020, 11:45am

Would you mind creating a new topic for this, since this topic here is already marked as solved? It just keeps the forums clean and easily searchable for anyone else running into similar problems. Thanks!

shaunak · April 20, 2020, 11:48am

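Indeed, that's what I meant by noting that event.dataset contains the fully-qualified dataset name, including the module name as a prefix (e.g. system.auth).

Yes definitely, either can work, at least for now. I prefer event.dataset because it's a core field in ECS and, as such, more future-proof.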
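
As a rough sketch of "either can work", a single condition could accept both the ECS field and the older fileset field during a transition. This assumes the events carry event.module alongside fileset.name, as reported earlier in this thread. The tag name is hypothetical.

```conf
filter {
  # Match on the ECS field first, falling back to the older fileset field.
  if [event][dataset] == "system.auth" or ([event][module] == "system" and [fileset][name] == "auth") {
    mutate {
      # Hypothetical tag, for illustration only.
      add_tag => ["system_auth"]
    }
  }
}
```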

system · May 18, 2020, 11:48am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
