Hi Ruflin,
I spent a lot of time troubleshooting this yesterday and believe I finally got it working.
- Confirmed that the IIS logs were configured for UTF-8
- Confirmed that the Filebeat configuration was also set to UTF-8 encoding
- Confirmed that parsing looked good in the Filebeat debug logs
Here is what I changed that seems to have corrected this.
The Logstash config was originally set to:
input {
  tcp { port => 3516 type => IIS }
}
filter {
  if [type] == "IIS" {
    # Drop IIS header/comment lines, which start with "#"
    if [message] =~ "^#" {
      drop {}
    }
    grok {
      match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{DATA:service-name} %{DATA:hostname} %{IPV4:site} %{DATA:method} %{URIPATH:page} %{NOTSPACE:querystring} %{INT:port} %{NOTSPACE:username} %{IPV4:clienthost} %{NOTSPACE:useragent} %{DATA:referer} %{INT:response} %{INT:subresponse} %{INT:scstatus} %{NUMBER:sc-bytes} %{NUMBER:cs-bytes} %{NUMBER:iis-time-taken}"]
    }
    # Use the timestamp from the log line as the event timestamp
    date {
      match => ["log_timestamp", "YYYY-MM-dd HH:mm:ss"]
    }
    mutate {
      convert => {
        "sc-bytes" => "integer"
        "cs-bytes" => "integer"
        "iis-time-taken" => "integer"
      }
    }
  }
}
And the input was changed to:
input {
  beats { port => 3516 type => IIS }
}
So this raises the question: why does the tcp input affect the encoding differently than the beats input?
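
For anyone who wants to dig into this, one way to compare what each input actually hands to the filters would be a temporary debug output. This is only a sketch and not part of my running config; it assumes the standard stdout output with the rubydebug codec, added alongside the input and filter sections above:

```
output {
  # Temporary debugging aid (an assumption, not from my original config):
  # print every event in full so the raw "message" field can be compared
  # between the tcp input and the beats input.
  stdout { codec => rubydebug }
}
```

Running Logstash once with each input and looking at the "message" field should show whether the text is already garbled before the grok filter ever sees it.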