Squid parsing stopped working for most fields

Hello everybody.

**I have a Logstash job loading data from a Kafka topic that has been running for many months without any issue. The only change in the last 3 weeks was the upgrade to ELK 7.0. Loading still worked fine after the upgrade, but then, on a certain day, it stopped parsing most of the fields for no apparent reason.**

Here's the .conf file:

```
input {
  kafka {
    bootstrap_servers => ":9094"
    topics => ["proxy"]
    codec => "json"
  }
}

filter {
  grok {
    patterns_dir => ["./patterns"]
    break_on_match => false
    match => { "payload" => "%{SQUID_TIMESTAMP}" }
    match => { "payload" => "%{TCP_STATUS}" }
    match => { "payload" => "%{TARGET_WEBSITE}" }
    match => { "payload" => "%{TARGET_WEBSITE_GET}" }
    match => { "payload" => "%{TARGET_WEBSITE_GET2}" }
    match => { "payload" => "%{SQUID_BYTES}" }
    match => { "payload" => "%{SQUID_IP}" }
    match => { "payload" => "%{SQUID_SOURCE_IP}" }
  }

  date {
    match => [ "proxy_timestamp", "UNIX" ]
  }

  mutate {
    convert => { "bytes" => "integer" }
  }

  geoip {
    source => "ip_address"
  }

  translate {
    field => "source_ip"
    destination => "source"
    dictionary => {
      "192.168.1.XXX" => "windows-hp"
      "192.168.1.XXX" => "linux-1"
      "192.168.1.XXX" => "linux-2"
      "192.168.1.XXX" => "linux-3"
      "192.168.1.XXX" => "laptop"
      "192.168.1.XXX" => "windows-tablet"
      "192.168.1.XXX" => "linux-master"
    }
  }
}

output {
  stdout {
    codec => rubydebug
  }

  elasticsearch {
    hosts => [ "192.168.1.XXX:9200", "192.168.1.XXX:9200", "192.168.1.XXX:9200" ]
    index => "proxy-%{+YYYY.MM}"
  }
}
```

Here's the pattern file:

```
SQUID_TIMESTAMP ^(?<proxy_timestamp>\d*)
TARGET_WEBSITE CONNECT\s(?<target_website>[^:]*)
TARGET_WEBSITE_GET GET htt\w*://(?<target_website>\w*.\w*.\w*)
TARGET_WEBSITE_GET2 GET htt\w*://(?<target_website>\w*.\w*)
TCP_STATUS TCP_(?<tcp_status>[^/]*)
SQUID_BYTES TCP_\w*/\d*\s(?<bytes>\d*)
SQUID_SOURCE_IP \d*.\d*\s*\d*\s(?<source_ip>\d*.\d*.\d*.\d*)
SQUID_IP HIER_DIRECT/(?<ip_address>\d*.\d*.\d*.\d*)
```
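Since both the payload and the grok patterns are plain regexes, they can be smoke-tested outside Logstash. The sketch below is my own check, not part of the original setup: it ports grok's Oniguruma `(?<name>...)` captures to Python's `(?P<name>...)` syntax and applies each pattern to the sample payload, merging all captures the way `break_on_match => false` is expected to. The `target_website` capture name and the escaped dots in the IP pattern are my assumptions.

```python
import re

# Sample payload line taken from the rubydebug output below.
payload = ("1557461305.894 686929 192.168.1.XXX TCP_TUNNEL/200 194202 "
           "CONNECT www.elastic.co:443 - HIER_DIRECT/151.101.130.217 -")

# Grok uses Oniguruma (?<name>...); Python needs (?P<name>...).
# Capture names are assumptions based on the fields used in the .conf.
patterns = {
    "SQUID_TIMESTAMP": r"^(?P<proxy_timestamp>\d*)",
    "TCP_STATUS":      r"TCP_(?P<tcp_status>[^/]*)",
    "TARGET_WEBSITE":  r"CONNECT\s(?P<target_website>[^:]*)",
    "SQUID_BYTES":     r"TCP_\w*/\d*\s(?P<bytes>\d*)",
    "SQUID_IP":        r"HIER_DIRECT/(?P<ip_address>\d*\.\d*\.\d*\.\d*)",
}

# break_on_match => false: try every pattern and merge all captures.
event = {}
for name, pattern in patterns.items():
    m = re.search(pattern, payload)
    if m:
        event.update(m.groupdict())

print(event)
```

Each pattern matches the sample line individually, so if a field is missing from the Logstash output, the pattern was likely never applied rather than failing to match.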

And below a sample of the output:

```
{
     "source_ip" => "192.168.1.XXX",
      "@version" => "1",
    "@timestamp" => 2019-05-10T04:09:09.360Z,
       "payload" => "1557461305.894 686929 192.168.1.XXX TCP_TUNNEL/200 194202 CONNECT www.elastic.co:443 - HIER_DIRECT/151.101.130.217 -",
        "schema" => {
            "type" => "string",
        "optional" => false
    },
          "tags" => [
        [0] "_geoip_lookup_failure"
    ],
        "source" => "windows-hp"
}
```

**Note that the parsing of "source_ip" worked, since it showed the value and properly mapped the device name, but all the other regexes in the pattern file did not work. There was absolutely no change to any configuration file, so I have no idea what could be causing this issue.**

Appreciate any tip or feedback.

Thanks.

Just made it work. I basically changed the grok filter to the following:

```
filter {
  grok {
    patterns_dir => ["./patterns"]
    break_on_match => false
    match => { "payload" => [
      "%{SQUID_SOURCE_IP}",
      "%{SQUID_IP}",
      "%{TCP_STATUS}",
      "%{TARGET_WEBSITE}",
      "%{TARGET_WEBSITE_GET}",
      "%{TARGET_WEBSITE_GET2}",
      "%{SQUID_BYTES}"
    ] }
  }
}
```

It's cleaner and makes more sense, although I still couldn't figure out why the previous one stopped working - maybe due to the version upgrade.
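One plausible explanation (a guess on my part, not confirmed): the old config repeated the `match` option eight times with the same `"payload"` key, and if those repeated options get merged like duplicate hash keys, only the last entry survives. Notably, the last `match` line was `SQUID_SOURCE_IP` - the one field that kept parsing - which is consistent with this guess. A quick Python dict analogy of that merge behavior:

```python
# Hypothetical illustration: merging hashes with duplicate keys keeps
# only the last value, which would leave just SQUID_SOURCE_IP active.
matches = [
    {"payload": "%{SQUID_TIMESTAMP}"},
    {"payload": "%{TCP_STATUS}"},
    {"payload": "%{SQUID_SOURCE_IP}"},  # last one wins
]

merged = {}
for m in matches:
    merged.update(m)

print(merged)  # {'payload': '%{SQUID_SOURCE_IP}'}
```

The array form avoids the problem entirely because all patterns live under a single `"payload"` key.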

Anyway, I'm keeping this post up in case somebody bumps into the same issue.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.