My Logstash config file is working now: I'm parsing the input log file and extracting the right data, all good there. The event looks good in the rubydebug output from Logstash. The issue I'm having is that since I started redirecting my extracted events to Elasticsearch, the event structure that ends up in ES is different from what I defined in the _mapping for the index. Somehow Logstash seems to be overriding the document properties.
Here is what it looks like in logstash console (all good):
{
"message" => "[7/30/15 18:50:59:616 GMT] 00000688 CurrencyExcha I Currency Exchange Rate File Retreival at 2015/07/30 18:50:59 success",
"@version" => "1",
"@timestamp" => "2015-08-06T16:13:38.100Z",
"host" => "IBM-EN189AKEUJ4",
"path" => "C:/logstash-1.5.3/inputs/SystemOut1.log",
"type" => "sysout",
"tslice" => "7/30/15 18:50:59:616",
"status" => "success"
}
When it gets to ES it looks terrible - Logstash seems to stuff everything into the [_source][message] field, seemingly ignoring the mapping definition in ES for the 'sprint' index... The doc looks like this in ES:
curl -XGET 'localhost:9200/sprint/sysout/AU8DyhBIErJTJM_bP4rF?pretty'
{
"_index" : "sprint",
"_type" : "sysout",
"_id" : "AU8DyhBIErJTJM_bP4rF",
"_version" : 1,
"found" : true,
"_source":{"message":"[7/30/15 18:50:59:616 GMT] 00000688 CurrencyExcha I Currency Exchange Rate File Retreival at 2015/07/30 18:50:59 success","@version":"1","@timestamp":"2015-08-06T16:13:38.100Z","host":"IBM-EN189AKEUJ4","path":"C:/logstash-1.5.3/inputs/SystemOut1.log","type":"sysout","tslice":"7/30/15 18:50:59:616","status":"success"}
}
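(In case it helps with diagnosing this: I believe a standard ES 1.x search with the fields parameter, like the one below, should show whether tslice and status were actually indexed as separate fields - the query value and field names are just taken from the sample event and mapping above:)
curl -XGET 'localhost:9200/sprint/sysout/_search?q=status:success&fields=tslice,status&pretty'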
Here is the _mapping that I've created for the 'sprint' index:
curl -XGET "localhost:9200/sprint/_mappings?pretty"
{
"sprint" : {
"mappings" : {
"sysout" : {
"properties" : {
"@timestamp" : {
"type" : "date",
"format" : "dateOptionalTime"
},
"@version" : {
"type" : "string"
},
"host" : {
"type" : "string"
},
"message" : {
"type" : "string"
},
"path" : {
"type" : "string"
},
"status" : {
"type" : "string"
},
"tslice" : {
"type" : "date",
"format" : "MM/dd/yy HH:mm:ss:SSS"
},
"type" : {
"type" : "string"
}
}
}
}
}
}
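(For context, I created that mapping up front with a PUT roughly like the following - reconstructed here from the GET output above and abridged to the two fields I care about, so the exact command I ran may have differed slightly:)
curl -XPUT 'localhost:9200/sprint/_mapping/sysout' -d '{
  "sysout" : {
    "properties" : {
      "tslice" : { "type" : "date", "format" : "MM/dd/yy HH:mm:ss:SSS" },
      "status" : { "type" : "string" }
    }
  }
}'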
I should also paste the current config file as it exists right now:
input {
  # Log file
  file {
    path => "C:/logstash-1.5.3/inputs/SystemOut1.log"
    type => "sysout"
    #start_position => "beginning"
  }
  # Standard input
  #stdin { }
}
filter {
  grok {
    match => [
      "message",
      "^\[%{DATESTAMP:tslice} GMT\] %{GREEDYDATA} (%{WORD:status}(\.)?)$"
    ]
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
}
output {
  elasticsearch {
    host => "localhost"
    index => "sprint"
  }
  stdout {
    codec => rubydebug
  }
}
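(For completeness: I start Logstash against this config from the logstash-1.5.3 directory roughly as below - the config file name here is just whatever I happened to save it as:)
bin\logstash.bat -f sysout.conf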
Why can't I get the individual fields in sprint/sysout to be populated properly? All I really care about in that document are the tslice and status fields...