Hey all, I'm using Logstash 2.3 to pull IIS logs into our little ES cluster running on site. Things are going well (albeit slowly), and now that I can start to trawl through some of our data, I'm seeing a LOT of results from the useragent filter come back as either Other or Generic Smartphone (iPhone and iPad are in the lead).
Second issue is that a lot of the log entries are like this:
2015-12-30 00:00:04 10.131.23.197 POST /handheld/resource/jobsheet/index.cfm - 443 - 1.129.96.46 Mozilla/5.0+(iPhone;+CPU+iPhone+OS+8_1_2+like+Mac+OS+X)+AppleWebKit/600.1.4+(KHTML,+like+Gecko)+Version/8.0+Mobile/12B440+Safari/600.1.4 200 0 0 36419 237
The user-agent string is (for whatever reason) replacing the standard spaces with "+" and it's this (I believe) that's causing the inaccurate matches.
How can we get logstash-filter-useragent updated for both: the latest regexes file AND support for "+" as a space?
It is recommended that you put these into a new file rather than editing the default file, otherwise your changes might get overwritten when upgrading plugins. We also kept the original value, as the user agent from Apache uses a space.
If someone has a better solution I would love to hear it. The downside is that we can't automatically upgrade to a newer version of the file when new OSes come out.
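For anyone wanting to try the separate-file approach, here is a minimal sketch of pointing the useragent filter at your own regexes file through its `regexes` option. The path and the `ua` target name are placeholders I made up, and I'm assuming the field holding the raw string is called `useragent`; adjust to your setup.

```
filter {
  useragent {
    # Field that holds the raw user-agent string (assumed name)
    source  => "useragent"
    # Hypothetical target field for the parsed result
    target  => "ua"
    # Hypothetical path to your own copy of regexes.yaml,
    # kept outside the plugin directory so upgrades don't overwrite it
    regexes => "/etc/logstash/ua/regexes.yaml"
  }
}
```

The file needs to be in the same YAML format as the default regexes file that ships with the plugin.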
I came across this issue and used the following in my filter to replace each "+" with a space. If you improve on this or have other suggestions for IIS logs, please share.
    mutate {
      gsub => [
        # replace + with a space " "
        "useragent", "[+]", " "
      ]
    }
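If you want to keep the original IIS value as well, something like the sketch below might work. It copies the field first and only cleans the copy. `useragent_clean` and `ua` are hypothetical field names, and I'm assuming the raw field is called `useragent` as in the snippet above.

```
filter {
  if [useragent] {
    # Copy the raw value; "useragent_clean" is a hypothetical field name
    mutate {
      add_field => { "useragent_clean" => "%{useragent}" }
    }
    # Separate mutate block, so the gsub only runs once the copy exists
    mutate {
      gsub => [ "useragent_clean", "[+]", " " ]
    }
    # Parse the cleaned copy; the original "useragent" stays untouched
    useragent {
      source => "useragent_clean"
      target => "ua"
    }
  }
}
```

The two mutate blocks are deliberate: within a single mutate, add_field is only applied after the other operations, so the gsub would run before the copy exists.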