Filebeat causes error messages in Logstash

Hello,

In this thread Badger points out that my supposed problem with Logstash is actually a problem with Filebeat. Unfortunately, he doesn't use Filebeat and therefore can't help me further. It's probably best if you read the original thread, but in summary: Filebeat sends unwanted agent info, which causes errors in Logstash. Below is an example of the data sent and of the error message:

"log" => {
"offset" => 36099,
"file" => {
"path" => "/var/log/nginx/access_corporate-lounge.de.log"
}
},
"tags" => [
[0] "beats_input_codec_plain_applied",
[1] "nginx-geoip"
],
"clientip" => "89.XXX",
"request" => "/",
"message" => "89.XXX - - [26/Feb/2020:13:03:46 +0100] "GET / HTTP/1.1" 301 178 "-" "check_http/v2.2 (monitoring-plugins 2.2)"",
"httpversion" => "1.1",
"ident" => "-",
"referrer" => ""-"",
"auth" => "-",
"verb" => "GET",
"@timestamp" => 2020-02-26T12:03:46.000Z,
"response" => 301,
"ecs" => {
"version" => "1.1.0"
},
"agent" => {
"id" => "b79a760d-b445-430b-86eb-c27229ebea56",
"ephemeral_id" => "32cda409-1a33-4864-a478-9c83110f45ce",
"version" => "7.5.2",
"hostname" => "xxxx",
"type" => "filebeat"
},
"@version" => "1",
"host" => {
"containerized" => false,
"os" => {
"family" => "debian",
"version" => "9 (stretch)",
"codename" => "stretch",
"kernel" => "4.9.0-8-amd64",
"platform" => "debian",
"name" => "Debian GNU/Linux"
},
"hostname" => "xxxxxx",
"id" => "522a68580a704f4b85b17ef9c7e870a7",
"architecture" => "x86_64",
"name" => "xxxxx"
},
"bytes" => 178,
"geoip" => {
"region_code" => "NH",
"ip" => "89.XXX",
"timezone" => "Europe/Amsterdam",
"country_code2" => "NL",
"city_name" => "Schellinkhout",
"latitude" => 52.6371,
"country_name" => "Netherlands",
"country_code3" => "NL",
"postal_code" => "1697",
"continent_code" => "EU",
"region_name" => "North Holland",
"longitude" => 5.1224,
"location" => {
"lat" => 52.6371,
"lon" => 5.1224
}
},
"input" => {
"type" => "log"

Error message:

[ERROR] 2020-02-26 13:08:45.590 [[main]>worker1] useragent - Uknown error while parsing user agent data {:exception=>#<TypeError: cannot convert instance of class org.jruby.RubyHash to class java.lang.String>, :field=>"agent", :event=>#LogStash::Event:0x18ece00f}

How can I fix this?

Yours faithfully
Stefan

Hey @stefan_schumacher,

Filebeat uses the agent field, as recommended by ECS, to contain information about the agent that collected the document, not the user agent, which according to ECS should be stored in user_agent.original.

The COMBINEDAPACHELOG grok pattern used in Logstash tries to store the parsed user agent in agent, but I guess it fails to do so because this field already contains an object.

Then the useragent plugin fails because, according to your configuration, it expects the user agent string to be in the agent field:

useragent {
    source => "agent"
} 
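
A quick workaround (untested, and not strictly ECS-friendly) would be to move Filebeat's agent object out of the way before the grok filter runs, so grok can store the user agent string in agent and your current useragent configuration keeps working; the name beat_agent below is an arbitrary choice:

mutate {
    # Preserve Filebeat's metadata under a different name, freeing
    # [agent] for the user agent string captured by COMBINEDAPACHELOG.
    rename => { "agent" => "beat_agent" }
}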

Alternatively, you could try to replace the COMBINEDAPACHELOG pattern with a custom pattern that stores the value in user_agent.original instead of agent (something like this, not tested):

grok {
    # NGINXLOG is COMMONAPACHELOG plus the referrer and the user agent,
    # with the user agent stored in user_agent.original instead of agent.
    match => [ "message" , "%{NGINXLOG}+%{GREEDYDATA:extra_fields}" ]
    overwrite => [ "message" ]
    pattern_definitions => {
        NGINXLOG => "%{COMMONAPACHELOG} %{QS:referrer} %{QS:user_agent.original}"
    }
}

And then change your useragent plugin configuration to parse the ECS field:

useragent {
    source => "user_agent.original"
    target => "user_agent"
} 
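
One caveat (also untested): in Logstash field references a dot is not a nesting separator, so user_agent.original above is a single top-level field whose name contains a literal dot. If you prefer a real nested ECS object, grok and useragent both accept bracketed field references:

grok {
    match => [ "message" , "%{NGINXLOG}+%{GREEDYDATA:extra_fields}" ]
    overwrite => [ "message" ]
    pattern_definitions => {
        # Same pattern as above, but the capture is written to the nested
        # field [user_agent][original] rather than a flat "user_agent.original".
        NGINXLOG => "%{COMMONAPACHELOG} %{QS:referrer} %{QS:[user_agent][original]}"
    }
}

useragent {
    source => "[user_agent][original]"
    target => "user_agent"
}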

But there may be an easier option if you are only collecting logs from Nginx. Filebeat includes modules for collecting logs from common services, and these modules already ship with the parsing pipelines for those services; there is an nginx module. The modules use Elasticsearch ingest nodes for parsing, so with them Filebeat can send its output directly to Elasticsearch and Logstash is not needed for parsing.
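
A minimal sketch of that setup, assuming a Filebeat 7.x package installation with default paths (adjust the Elasticsearch hosts to your cluster):

# filebeat.yml: send events straight to Elasticsearch, no Logstash in between
output.elasticsearch:
  hosts: ["localhost:9200"]

Then enable the nginx module and load the index template and dashboards:

filebeat modules enable nginx
filebeat setup
systemctl restart filebeat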

I would recommend using Filebeat modules instead of Logstash for your use case.
