On my Docker servers I use the GELF logging driver to ship container logs in GELF format to Logstash.
On the Logstash side I prepared the following listener:
input {
  gelf {
    host => "0.0.0.0"
    port => 5000
    type => "docker"
  }
}
Whatever a container writes to stdout is forwarded by the Docker daemon to Logstash's gelf listener. Sometimes the stdout logs of a container look like this:
10.42.23.37 - - [29/Sep/2017 13:11:55] "GET / HTTP/1.1" 200 -
Sometimes, depending on the application, the logs sent to stdout are in JSON format, for example:
{"pid":10,"hostname":"b4cf4da50843","level":20,"time":1506690729316,"msg":"Successfully loaded schema 'LDDesign'","v":1}
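To illustrate the result I'm after, parsing that line as JSON yields exactly the fields I want; a quick sketch with Python's json module (outside Logstash, just to show the expected split):

```python
import json

# The sample JSON log line from above; the single quotes inside the
# msg value are escaped for the Python string literal.
line = ('{"pid":10,"hostname":"b4cf4da50843","level":20,'
        '"time":1506690729316,'
        '"msg":"Successfully loaded schema \'LDDesign\'","v":1}')

doc = json.loads(line)
print(sorted(doc))  # ['hostname', 'level', 'msg', 'pid', 'time', 'v']
```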
Of course, when they arrive in JSON format, I'd like to split the message up into its fields (pid, hostname, level, msg, v).
This does work if I apply the following filter:
filter {
  if [type] == "docker" {
    json { source => "message" }
  }
}
However, this tries to JSON-parse every message, which produces a lot of errors on non-JSON messages, like this:
[2017-09-29T15:16:55,899][WARN ][logstash.filters.json ] Error parsing json {:source=>"message", :raw=>"10.42.159.41 - - [29/Sep/2017 13:16:55] "GET / HTTP/1.1" 200 -\r", :exception=>#<LogStash::Json::ParserError: Unexpected character ('.' (code 46)): Expected space separating root-level values
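To double-check, that access-log line really is rejected by any strict JSON parser; here is the failure reproduced outside Logstash with Python's json module:

```python
import json

# The Apache-style access log line from the warning above, including
# the trailing carriage return shown in the :raw field.
line = '10.42.159.41 - - [29/Sep/2017 13:16:55] "GET / HTTP/1.1" 200 -\r'

try:
    json.loads(line)
    print("parsed")
except json.JSONDecodeError as err:
    print("not JSON:", err.msg)
```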
I came across Unable to avoid JSON parsing errors in logstash log-file, which suggests a conditional check "to see if the message field looks like json". But the suggested if condition doesn't seem to detect JSON correctly in my case. I tried:
filter {
  if [type] == "docker" {
    if [message] =~ "\A\{.+\}\z" {
      json { source => "message" }
    }
  }
}
But now nothing is detected as JSON and the JSON messages are not split into fields.
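My suspicion is a trailing control character: the warning above shows the raw message ending in \r, and if the JSON messages arrive with a similar trailer, the \z anchor (absolute end of string) would no longer sit right after the closing brace. A quick sketch in Python, whose \Z behaves like Ruby's \z (Logstash conditionals use Ruby-compatible regexes, as far as I know), with hypothetical sample messages:

```python
import re

pattern = r"\A\{.+\}\Z"  # Python's \Z ~ Ruby's \z: absolute end of string

clean = '{"pid":10,"msg":"hello"}'        # hypothetical message, no trailer
trailing = '{"pid":10,"msg":"hello"}\r'   # same message with a trailing \r

print(bool(re.search(pattern, clean)))     # True
print(bool(re.search(pattern, trailing)))  # False: \Z fails before the \r
```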
Is there a general, known way to achieve this, i.e. to detect JSON messages, for example with a regular expression?