I'm trying to get Logstash to ingest an IIS web log, geolocate the client IP address using something like MaxMind's GeoLite database, and then send the result to Elasticsearch. Has anyone done something similar before? I'm not an ELK guru, so any help is much appreciated.
I have ELK running, but I can't quite get the syntax in my logstash.conf right to make this work.
I'm trying to keep it simple at the moment: I first want to just get my IIS web log data into Elasticsearch. Here is my .conf file:
input {
file {
type => "iis-w3c"
path => "C:/logs/*.log"
}
}
filter {
## Ignore the comments that IIS will add to the start of the W3C logs
#
if [message] =~ "^#" {
drop {}
}
grok {
## Very helpful site for building these statements:
# http://grokdebug.herokuapp.com/
#
# This is configured to parse out every field of IIS's W3C format when
# every field is included in the logs
#
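    # A sample line in that full format (values here are made up purely
    # for illustration) would look like:
    #   2017-04-01 12:00:00 W3SVC1 GET /index.html - 80 - 203.0.113.5 Mozilla/5.0 - - example.com 200 0 0 5123 321 15
    #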
match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:s-sitename} % {WORD:cs-method} %{URIPATH:cs-uri-stem} %{NOTSPACE:cs-uri-query} %{NUMBER:s-port} % {NOTSPACE:cs-username} %{IPORHOST:c-ip} %{NOTSPACE:cs(User-Agent)} %{NOTSPACE:cs(Cookie)} %{NOTSPACE:cs(Referer)} %{NOTSPACE:cs-host} %{NUMBER:sc-status:int} %{NUMBER:sc- substatus:int} %{NUMBER:sc-win32-status:int} %{NUMBER:sc-bytes:int} %{NUMBER:cs-bytes:int} % {NUMBER:time-taken:int}" ,
"message", "%{TIMESTAMP_ISO8601:timestamp} %{IPORHOST:s-sitename} %{WORD:cs-method} % {URIPATH:cs-uri-stem} %{NOTSPACE:cs-uri-query} %{NUMBER:s-port} %{NOTSPACE:cs-username} % {IPORHOST:c-ip} %{NOTSPACE:cs(User-Agent)} %{NOTSPACE:cs(Referer)} %{NUMBER:response:int} % {NUMBER:sc-substatus:int} %{NUMBER:sc-substatus:int} %{NUMBER:time-taken:int}" ,
"message", "%{TIMESTAMP_ISO8601:timestamp} %{WORD:cs-method} %{URIPATH:cs-uri-stem} % {NOTSPACE:cs-post-data} %{NUMBER:s-port} %{IPORHOST:c-ip} HTTP/%{NUMBER:c-http-version} % {NOTSPACE:cs(User-Agent)} %{NOTSPACE:cs(Cookie)} %{NOTSPACE:cs(Referer)} %{NOTSPACE:cs- host} %{NUMBER:sc-status:int} %{NUMBER:sc-bytes:int} %{NUMBER:cs-bytes:int} %{NUMBER:time- taken:int}"
]
}
  ## Set the event timestamp from the 'timestamp' field parsed out of the log
  #
  date {
    match => [ "timestamp", "YYYY-MM-dd HH:mm:ss" ]
    timezone => "Etc/UTC"
  }
  ## If the log record has a value for 'sc-bytes' (bytes sent), then add a
  # new field to the event that converts it to kilobytes
  #
  if [sc-bytes] {
    ruby {
      code => "event.set('kilobytesSent', event.get('sc-bytes').to_i / 1024.0)"
    }
  }
  ## Do the same conversion for the bytes received value
  #
  if [cs-bytes] {
    ruby {
      code => "event.set('kilobytesReceived', event.get('cs-bytes').to_i / 1024.0)"
    }
  }
## Perform some mutations on the records to prep them for Elastic
#
mutate {
    ## Convert some fields from strings to integers (the grok patterns
    # above already cast these with :int, so this is just a safety net)
    #
    convert => ["sc-bytes", "integer"]
    convert => ["cs-bytes", "integer"]
    convert => ["time-taken", "integer"]
    ## Finally, remove the original timestamp field since the event now
    # has the proper date on it
    #
    remove_field => [ "timestamp" ]
}
  ## Parse out the user agent; 'source' has to name the grok capture that
  # holds the raw user agent string
  #
  useragent {
    source => "cs(User-Agent)"
    prefix => "browser"
  }
}
## We're only going to output these records to Elasticsearch so configure
# that.
#
output {
elasticsearch {
embedded => false
host => "localhost"
port => 9200
protocol => "http"
#
## Log records into month-based indexes
#
index => "%{type}-%{+YYYY.MM}"
}
## stdout included just for testing
#
# stdout {codec => rubydebug}
}
I don't think it likes my grok, though, as I get this error when I run it:
[ERROR][logstash.agent ] fetched an invalid config {:config=>"input { \n  file {\n    type => \"iis-w3c\"\n    path => \"C:/logs/*.log\"\n  }\n}\n\nfilter { \n  ...", :reason=>"Something is wrong with your configuration."}
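While digging into that error, I noticed that newer Logstash releases dropped the embedded, host, port, and protocol options from the elasticsearch output in favour of a single hosts setting, so the validator may be rejecting my output block rather than the grok. If I've read the docs right, the output would need to look something like this (keeping my month-based index name):
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    ## Log records into month-based indexes
    #
    index => "%{type}-%{+YYYY.MM}"
  }
}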
I've now modified my conf a little and am able to get IIS logs into Elasticsearch. I'm not sure what to do next, though, to get the geolocation based on the IP. From what I understand, Logstash already has the MaxMind geo database built into it? Does anyone have an example of how to get this working? Here is my current conf, followed by a sketch of the geoip step I'm planning to try.
input {
file {
#type => "iis"
path => "C:/logs/*.log"
start_position => "beginning"
}
}
filter {
#ignore log comments
if [message] =~ "^#" {
drop {}
}
grok {
# check that fields match your IIS log settings
match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:s-sitename} %{IPORHOST:s-ip} %{WORD:cs-method} %{URIPATH:cs-uri-stem} %{NOTSPACE:cs-uri-query} %{NUMBER:s-port} %{NOTSPACE:cs-username} %{IPORHOST:c-ip} %{WORD:cs-version}"]
}
#Set the event timestamp from the 'timestamp' field parsed by grok
date {
match => [ "timestamp", "YYYY-MM-dd HH:mm:ss" ]
timezone => "Etc/UTC"
}
mutate {
remove_field => [ "log_timestamp"]
}
}
# See documentation for different protocols:
# http://logstash.net/docs/1.4.2/outputs/elasticsearch
output {
# stdout { codec => rubydebug }
elasticsearch { hosts => ["localhost:9200"] }
}
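For the geolocation step, from what I can tell the missing piece is the geoip filter, which ships with a bundled MaxMind GeoLite2 City database in recent Logstash versions, so nothing extra should need to be downloaded. This is only a sketch based on my reading of the docs, assuming the c-ip capture from my grok holds the client address:
filter {
  geoip {
    # Field holding the IP to look up; c-ip is the client IP captured
    # by the grok pattern above
    source => "c-ip"
  }
}
If I understand it correctly, this adds a geoip object (country, city, location, and so on) to each event, and the stock Elasticsearch template maps geoip.location as a geo_point when the index name matches the default logstash-* pattern (which mine does, since I'm not setting a custom index). With a custom index name the mapping would need to be set up by hand before Kibana's map visualisations will work.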