Hello everybody.
**I have a Logstash job that loads data from a Kafka topic and has been running for many months without any issue. The only change in the last 3 weeks was the upgrade to ELK 7.0. After the upgrade the loading still worked fine; however, from a certain day on, it stopped parsing most of the fields for no apparent reason.**
Here's the .conf file:
input {
  kafka {
    bootstrap_servers => ":9094"
    topics => ["proxy"]
    codec => "json"
  }
}
filter {
  grok {
    patterns_dir => ["./patterns"]
    break_on_match => false
    match => { "payload" => "%{SQUID_TIMESTAMP}" }
    match => { "payload" => "%{TCP_STATUS}" }
    match => { "payload" => "%{TARGET_WEBSITE}" }
    match => { "payload" => "%{TARGET_WEBSITE_GET}" }
    match => { "payload" => "%{TARGET_WEBSITE_GET2}" }
    match => { "payload" => "%{SQUID_BYTES}" }
    match => { "payload" => "%{SQUID_IP}" }
    match => { "payload" => "%{SQUID_SOURCE_IP}" }
  }
  date {
    match => [ "proxy_timestamp","UNIX" ]
  }
  mutate {
    convert => { "bytes" => "integer" }
  }
  geoip {
    source => "ip_address"
  }
  translate {
    field => "source_ip"
    destination => "source"
    dictionary => {
      "192.168.1.XXX" => "windows-hp"
      "192.168.1.XXX" => "linux-1"
      "192.168.1.XXX" => "linux-2"
      "192.168.1.XXX" => "linux-3"
      "192.168.1.XXX" => "laptop"
      "192.168.1.XXX" => "windows-tablet"
      "192.168.1.XXX" => "linux-master"
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => [ "192.168.1.XXX:9200","192.168.1.XXX:9200","192.168.1.XXX:9200" ]
    index => "proxy-%{+YYYY.MM}"
  }
}
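
For comparison, here is a sketch of how the same grok block could be written with a single `match` option whose value is an array of patterns. It is only an alternative spelling of the block above, assuming the goal is to apply every pattern to `payload`; it does not rely on repeated `match` options being merged. Whether the repeated options have anything to do with the problem has not been verified.

```
filter {
  grok {
    patterns_dir => ["./patterns"]
    # Keep evaluating the remaining patterns after the first one matches
    break_on_match => false
    # One "match" option, one field, all patterns listed in an array
    match => {
      "payload" => [
        "%{SQUID_TIMESTAMP}",
        "%{TCP_STATUS}",
        "%{TARGET_WEBSITE}",
        "%{TARGET_WEBSITE_GET}",
        "%{TARGET_WEBSITE_GET2}",
        "%{SQUID_BYTES}",
        "%{SQUID_IP}",
        "%{SQUID_SOURCE_IP}"
      ]
    }
  }
}
```

The rest of the pipeline (date, mutate, geoip, translate and the outputs) would stay unchanged.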
Here's the pattern file:
SQUID_TIMESTAMP ^(?<proxy_timestamp>\d*)
TARGET_WEBSITE CONNECT\s(?[^:]*)
TARGET_WEBSITE_GET GET htt\w*://(?\w*.\w*.\w*)
TARGET_WEBSITE_GET2 GET htt\w*://(?\w*.\w*)
TCP_STATUS TCP_(?<tcp_status>[^/]*)
SQUID_BYTES TCP_\w*/\d*\s(?\d*)
SQUID_SOURCE_IP \d*.\d*\s*\d*\s(?<source_ip>\d*.\d*.\d*.\d*)
SQUID_IP HIER_DIRECT/(?<ip_address>\d*.\d*.\d*.\d*)
And below is a sample of the output:
{
    "source_ip" => "192.168.1.XXX",
    "@version" => "1",
    "@timestamp" => 2019-05-10T04:09:09.360Z,
    "payload" => "1557461305.894 686929 192.168.1.XXX TCP_TUNNEL/200 194202 CONNECT www.elastic.co:443 - HIER_DIRECT/151.101.130.217 -",
    "schema" => {
        "type" => "string",
        "optional" => false
    },
    "tags" => [
        [0] "_geoip_lookup_failure"
    ],
    "source" => "windows-hp"
}
**Note that the parsing of "source_ip" worked: its value is present and was correctly mapped to the device name. None of the other patterns in the pattern file produced a field, which is also why the geoip lookup failed (there is no "ip_address" to look up). There was absolutely no change to any configuration file, so I have no idea what could be causing this issue.**
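
To narrow this down, a single pattern could be tested in isolation, without Kafka or Elasticsearch. The pipeline below is a minimal sketch under a few assumptions: the file name `test.conf` is hypothetical, the Kafka message is JSON with a `payload` field as in the sample output, and `./patterns` is the same patterns directory as above. It feeds one event built from that sample through grok with only `%{TCP_STATUS}` and prints the result.

```
input {
  generator {
    # One sample event shaped like the Kafka message (JSON with a "payload" field)
    lines => [ '{"payload":"1557461305.894 686929 192.168.1.XXX TCP_TUNNEL/200 194202 CONNECT www.elastic.co:443 - HIER_DIRECT/151.101.130.217 -"}' ]
    # Emit the line once instead of looping forever
    count => 1
    codec => "json"
  }
}
filter {
  grok {
    patterns_dir => ["./patterns"]
    break_on_match => false
    # Swap in one pattern at a time to see which ones still produce a field
    match => { "payload" => "%{TCP_STATUS}" }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
```

It can be run with `bin/logstash -f test.conf`. If "tcp_status" shows up here but not in the full pipeline, the pattern itself is fine and the difference lies in how the full grok block is evaluated.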
Appreciate any tip or feedback.
Thanks.