IIS logging and _grokparsefailure tag

Hi,

I want to ship our Microsoft IIS logs, but I am getting a _grokparsefailure tag. I can see the messages in Kibana, but I'm not sure why this tag appears. Here is my NXLog conf:

define ROOT C:\Program Files (x86)\nxlog

Moduledir %ROOT%\modules
CacheDir %ROOT%\data
Pidfile %ROOT%\data\nxlog.pid
SpoolDir %ROOT%\data
LogFile %ROOT%\data\nxlog.log

<Extension syslog>
    Module xm_syslog
</Extension>

<Extension json>
    Module xm_json
</Extension>

<Extension w3c>
    Module xm_csv
    Fields $date, $time, $s-ip, $cs-method, $cs-uri-stem, $cs-uri-query, $s-port, $cs-username, $c-ip, $cs(User-Agent), $sc-status, $sc-substatus, $sc-win32-status, $time-taken
    FieldTypes string, string, string, string, string, string, integer, string, string, string, integer, integer, integer, integer
    Delimiter ' '
</Extension>

define BASEDIR C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\MessageTracking

<Input iis_exchange_v1>
    Module im_file
    File "C:\inetpub\logs\LogFiles\W3SVC1\u_ex*.log"
    ReadFromLast True
    SavePos True
    Exec if $raw_event =~ /^#/ drop();
    # else \
    # { \
    #     w3c->parse_csv(); \
    #     $EventTime = parsedate($date + " " + $time); \
    #     $SourceName = "excahnge_V1_iis_log"; \
    #     $Message = to_json(); \
    # }
</Input>

<Output out_iis_exchange_v1>
    Module om_tcp
    Host 192.195.88.35
    Port 3516
    Exec $SourceName = 'excahnge_V1_iis_log';
    OutputType LineBased
</Output>

<Route 1>
    Path iis_exchange_v1 => out_iis_exchange_v1
</Route>

and this is the Logstash config:

input {
  #tcp Exchange IIS logs stream via 3516
  tcp {
    type => "iis_exch_v1"
    port => 3516
    #host => "192.195.88.113"
  }
} # end input

filter {

  if [type] == "iis_exch_v1" {
    csv { add_tag => ['exh_v1_iis'] }
    grok {
      match => ['@message', "%{TIMESTAMP_ISO8601:timestamp} %{IPORHOST:hostip} %{WORD:method} %{URIPATH:page} %{NOTSPACE:query} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientip} %{NOTSPACE:useragent} %{NOTSPACE:referrer} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:scstatus} %{NUMBER:timetaken}"]
    }
    #if "_grokparsefailure" in [tags] {
    #  drop { }
    #}
  }
}

output {
  elasticsearch { hosts => ["192.195.88.35:9200"] }
  #protocol => "http"
  stdout { codec => rubydebug }
} # end output

Could you please help me out here and let me know whether I am doing this right?

The _grokparsefailure tag indicates a problem with your grok expression. Are you sure you should be matching the expression against the @message field rather than message? Please show the output of the stdout output so that we can see what the events look like.
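For example, something along these lines (just a sketch; whether message is the right source field depends on what your events actually contain, and the pattern still has to line up field for field with the log line):

grok {
  match => ['message', "%{TIMESTAMP_ISO8601:timestamp} %{IPORHOST:hostip} %{WORD:method} %{URIPATH:page} %{NOTSPACE:query} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientip} %{NOTSPACE:useragent} %{NOTSPACE:referrer} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:scstatus} %{NUMBER:timetaken}"]
}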

This is what you need:

@timestamp March 29th 2016, 09:40:28.033
t@version 1
t_id AVPBGiO3tOFqHDxvrSdW
t_index logstash-2016.03.29
#_score
t_type iis
?column1 2016-03-29 06:40:27 192.195.88.113 POST /EWS/Exchange.asmx - 443 QATAR\eric.wilson 192.195.88.115 MacOutlook/0.0.0.160212+(Intel+Mac+OS+X+Version+10.11.4+(Build+15E65)) 200 0 0 0
thost 192.195.88.113
tmessage 2016-03-29 06:40:27 192.195.88.113 POST /EWS/Exchange.asmx - 443 QATAR\eric.wilson 192.195.88.115 MacOutlook/0.0.0.160212+(Intel+Mac+OS+X+Version+10.11.4+(Build+15E65)) 200 0 0 0
?port 42610
ttags exh_v1_iis, _grokparsefailure
ttype iis

No, that's a copy/paste from Kibana and nearly unreadable. Please show the output of the stdout output. If you run Logstash as a service it should end up in your log file. When you've verified that things work, you should comment out that output so you won't run out of disk space.

Can I attach the log file? Here is the excerpt from it:

{:timestamp=>"2016-03-29T09:47:58.842000+0300", :message=>"Using mapping template from", :path=>nil, :level=>:info}
{:timestamp=>"2016-03-29T09:47:59.061000+0300", :message=>"Attempting to install template", :manage_template=>{"template"=>"logstash-", "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"default"=>{"_all"=>{"enabled"=>true, "omit_norms"=>true}, "dynamic_templates"=>[{"message_field"=>{"match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"string", "index"=>"analyzed", "omit_norms"=>true, "fielddata"=>{"format"=>"disabled"}}}}, {"string_fields"=>{"match"=>"", "match_mapping_type"=>"string", "mapping"=>{"type"=>"string", "index"=>"analyzed", "omit_norms"=>true, "fielddata"=>{"format"=>"disabled"}, "fields"=>{"raw"=>{"type"=>"string", "index"=>"not_analyzed", "doc_values"=>true, "ignore_above"=>256}}}}}, {"float_fields"=>{"match"=>"", "match_mapping_type"=>"float", "mapping"=>{"type"=>"float", "doc_values"=>true}}}, {"double_fields"=>{"match"=>"", "match_mapping_type"=>"double", "mapping"=>{"type"=>"double", "doc_values"=>true}}}, {"byte_fields"=>{"match"=>"", "match_mapping_type"=>"byte", "mapping"=>{"type"=>"byte", "doc_values"=>true}}}, {"short_fields"=>{"match"=>"", "match_mapping_type"=>"short", "mapping"=>{"type"=>"short", "doc_values"=>true}}}, {"integer_fields"=>{"match"=>"", "match_mapping_type"=>"integer", "mapping"=>{"type"=>"integer", "doc_values"=>true}}}, {"long_fields"=>{"match"=>"", "match_mapping_type"=>"long", "mapping"=>{"type"=>"long", "doc_values"=>true}}}, {"date_fields"=>{"match"=>"", "match_mapping_type"=>"date", "mapping"=>{"type"=>"date", "doc_values"=>true}}}, {"geo_point_fields"=>{"match"=>"", "match_mapping_type"=>"geo_point", "mapping"=>{"type"=>"geo_point", "doc_values"=>true}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "doc_values"=>true}, "@version"=>{"type"=>"string", "index"=>"not_analyzed", "doc_values"=>true}, "geoip"=>{"type"=>"object", "dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip", "doc_values"=>true}, "location"=>{"type"=>"geo_point", "doc_values"=>true}, "latitude"=>{"type"=>"float", "doc_values"=>true}, "longitude"=>{"type"=>"float", "doc_values"=>true}}}}}}}, :level=>:info}
{:timestamp=>"2016-03-29T09:47:59.264000+0300", :message=>"New Elasticsearch output", :class=>"LogStash::Outputs::ElasticSearch", :hosts=>["192.195.88.35:9200"], :level=>:info}
{:timestamp=>"2016-03-29T09:47:59.280000+0300", :message=>"Using mapping template from", :path=>nil, :level=>:info}
{:timestamp=>"2016-03-29T09:47:59.280000+0300", :message=>"Attempting to install template", :manage_template=>{"template"=>"logstash-", "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"default"=>{"_all"=>{"enabled"=>true, "omit_norms"=>true}, "dynamic_templates"=>[{"message_field"=>{"match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"string", "index"=>"analyzed", "omit_norms"=>true, "fielddata"=>{"format"=>"disabled"}}}}, {"string_fields"=>{"match"=>"", "match_mapping_type"=>"string", "mapping"=>{"type"=>"string", "index"=>"analyzed", "omit_norms"=>true, "fielddata"=>{"format"=>"disabled"}, "fields"=>{"raw"=>{"type"=>"string", "index"=>"not_analyzed", "doc_values"=>true, "ignore_above"=>256}}}}}, {"float_fields"=>{"match"=>"", "match_mapping_type"=>"float", "mapping"=>{"type"=>"float", "doc_values"=>true}}}, {"double_fields"=>{"match"=>"", "match_mapping_type"=>"double", "mapping"=>{"type"=>"double", "doc_values"=>true}}}, {"byte_fields"=>{"match"=>"", "match_mapping_type"=>"byte", "mapping"=>{"type"=>"byte", "doc_values"=>true}}}, {"short_fields"=>{"match"=>"", "match_mapping_type"=>"short", "mapping"=>{"type"=>"short", "doc_values"=>true}}}, {"integer_fields"=>{"match"=>"", "match_mapping_type"=>"integer", "mapping"=>{"type"=>"integer", "doc_values"=>true}}}, {"long_fields"=>{"match"=>"", "match_mapping_type"=>"long", "mapping"=>{"type"=>"long", "doc_values"=>true}}}, {"date_fields"=>{"match"=>"", "match_mapping_type"=>"date", "mapping"=>{"type"=>"date", "doc_values"=>true}}}, {"geo_point_fields"=>{"match"=>"", "match_mapping_type"=>"geo_point", "mapping"=>{"type"=>"geo_point", "doc_values"=>true}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "doc_values"=>true}, "@version"=>{"type"=>"string", "index"=>"not_analyzed", "doc_values"=>true}, "geoip"=>{"type"=>"object", "dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip", "doc_values"=>true}, "location"=>{"type"=>"geo_point", "doc_values"=>true}, "latitude"=>{"type"=>"float", "doc_values"=>true}, "longitude"=>{"type"=>"float", "doc_values"=>true}}}}}}}, :level=>:info}

No, that's not it. It should look similar to this:

{
       "message" => "Tue Mar 29 09:25:00 CEST 2016",
      "@version" => "1",
    "@timestamp" => "2016-03-29T07:25:07.847Z",
          "host" => "lnxolofon"
}

Come to think of it, maybe it ends up in logstash.stdout rather than logstash.log.

Thanks for being helpful Magnus.

Should I use stdout { codec => rubydebug } or stdout { codec => json } as output?

I am running Logstash on a Windows box. I can't find a way to produce stdout output other than the log file. Could you please help me in this regard?

Should I use stdout { codec => rubydebug } or stdout { codec => json } as output?

I think the former gives more readable output.
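If running it as a service only gives you log files, another option is to stop the service temporarily and run Logstash from a command prompt so the stdout output goes straight to the console. Something along these lines (the paths and config file name are assumptions for your setup):

cd C:\logstash
bin\logstash.bat -f C:\logstash\conf\iis_exchange.conf

Once you can see the rubydebug events in the console you'll know exactly which fields Logstash is receiving.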

The w3c->parse_csv() / $Message = to_json(); combination in your NXLog config is supposed to split the raw line into fields and pack them into JSON in the Message field. So by the time it hits Logstash, your data should already be structured.
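One consequence of that approach: what arrives over the TCP input is a JSON string in the message field, so on the Logstash side you would normally decode it rather than grok it. A minimal sketch, assuming the JSON really does end up in message:

filter {
  if [type] == "iis_exch_v1" {
    json { source => "message" }
  }
}

(A json codec on the tcp input would achieve the same thing.)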

Your grok expression looks like you are trying to split up the message field again. Why is that? It was already split up in NXLog, so you shouldn't need to redo it with grok; you will just end up with duplicate fields.
Try commenting out the entire grok filter, and change your output to file or stdout to confirm what the events currently look like. You may not need the grok expression at all.
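For example, a throwaway debug output could look like this (the file path is only an example):

output {
  file { path => "C:/temp/logstash_debug.log" }
  stdout { codec => rubydebug }
}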

Thanks bhatch...

Is there any problem with this module? The log reports "ERROR invalid keyword: else":

Module im_file File "C:\inetpub\logs\LogFiles\W3SVC1\u_ex*.log" ReadFromLast True SavePos True
Exec if $raw_event =~ /^#/ drop();	\		
   else\
   {\
        w3c->parse_csv();\
        $SourceName = "exch_v1_iis_log";\
        $Message = to_json();\
  }

Alright, I fixed it. I had to remove the extra spaces in front of the backslashes. Now, if I comment out the whole grok filter, can I still use the mutate, date and useragent filters?

You should be able to. Both the NXLog parser and your grok expression give the same end result; it's just a matter of deciding which application you want doing the work. You can do the CSV-to-JSON conversion in Logstash by leaving the grok expression in and taking the parsing out of NXLog, but we found it faster to have 100 IIS machines do the work in NXLog than to have a handful of Logstash machines do it later down the line. If you were using the Logstash file input to read the IIS logs directly, then you would want to use grok (or maybe kv) to do the work.

We let NXLog do the initial CSV to JSON conversion, and then use various filters such as translate, mutate, useragent, geoip, and grok expressions in Logstash to get the data formatted exactly how we want it.
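As a rough sketch of what that can look like (the field names here are taken from your grok pattern; adjust them to whatever your events actually contain):

filter {
  if [type] == "iis_exch_v1" {
    useragent {
      source => "useragent"
      target => "ua"
    }
    geoip {
      source => "clientip"
    }
  }
}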

One last thing: the field names are set in NXLog. It looks like you used a different naming convention in NXLog than in grok. For example, in NXLog you have s-ip, but grok says hostip. Keep that in mind when creating any further filters.
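If you want the two naming conventions to line up, a mutate rename is one way to do it (a sketch showing only a couple of the fields):

filter {
  mutate {
    rename => {
      "s-ip" => "hostip"
      "c-ip" => "clientip"
    }
  }
}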

Hi Bhatch,

Thanks a lot for your detailed reply. We Windows users who are trying to use UNIX-based solutions need this kind of help and guidance. Keep it up.

I can see around 12 weird fields in Kibana, named Column1, Column2, Column12, etc. Why are these fields showing up like this?

I can see around 12 weird fields in Kibana, named Column1, Column2, Column12, etc. Why are these fields showing up like this?

Because you use a csv filter without naming the columns?
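In other words, if you keep the csv filter in Logstash you need to tell it what the columns are, for example (a sketch; the column names are only illustrative and should match your W3C field list):

filter {
  csv {
    separator => " "
    columns => ["date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query", "s-port", "cs-username", "c-ip", "cs-user-agent", "cs-referer", "sc-status", "sc-substatus", "sc-win32-status", "time-taken"]
    add_tag => ['exh_v1_iis']
  }
}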

Hi:

This is my nxlog.conf:

<Extension w3c>
    Module xm_csv
    Fields $date, $time, $s-ip, $cs-method, $cs-uri-stem, $cs-uri-query, $s-port, $cs-username, $c-ip, $cs(User-Agent), $cs-Referer, $sc-status, $sc-substatus, $sc-win32-status, $time-taken
    FieldTypes string, string, string, string, string, string, integer, string, string, string, string, integer, integer, integer, integer
    Delimiter ' '
    QuoteChar '"'
    EscapeControl FALSE
    UndefValue -
</Extension>

<Input iis_exchange_v2>  
    Module    im_file
    File    "C:\inetpub\logs\LogFiles\W3SVC1\u_ex*.log"
    ReadFromLast True
    SavePos True

    Exec if $raw_event =~ /^#/ drop();\
	else\
	{\
	w3c->parse_csv();\
	$SourceName = "exch_v2_iis_log";\
	$Message = to_json();\
	}
</Input>

How and where should I name the columns? Sorry for the newbie question.

I was assuming that it was your csv filter in Logstash that was doing the parsing. If the CSV parsing happens on the NXLog side, I can't help.

Thanks, it's fixed. I was only using csv { add_tag => ['exh_iis_log'] }; as soon as I commented it out, the Column fields disappeared.