Logstash multiline fails to detect TIMESTAMP and puts the whole file into a single doc

I am using a docker container to spin and push logs up (not a huge quantity) from my PHP app where sometimes lines are broken down into multiple lines by the framework. Example

    2017/05/21 06:02:05 [trace] [system.caching.CFileCache] Serving "Yii.CUrlManager.rules" from cache
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/index.php (13)
    2017/05/21 06:02:05 [trace] [application] Instance instance: OQn25vMyzo5XgX9A3tu6WBy-nHoPUxjly0SR0A0KEEs.eyJpbnN0YW5jZUlkIjoiMTRiYTY0YjgtN2QzMS04YTNjLWZhNTMtODJmN2JiNzY5N2MyIiwic2lnbkRhdGUiOiIyMDE3LTA1LTIxVDA2OjAyOjAyLjE1NVoiLCJ1aWQiOm51bGwsImlwQW5kUG9ydCI6IjczLjI1Mi4yMDUuMTM4LzM3NjQwIiwidmVuZG9yUHJvZHVjdElkIjoiMSIsImRlbW9Nb2RlIjpmYWxzZSwiYWlkIjoiZTc4NGQ1MDYtYWYxZS00OWI0LTkzMDQtN2M4ZGQxZDI3ZjJmIiwic2l0ZU93bmVySWQiOiI5MTI0ZDQ0YS1kZmY5LTRiNDEtYjZjZi0wZWEzMmE5MjAxYjkifQ code:  OQn25vMyzo5XgX9A3tu6WBy-nHoPUxjly0SR0A0KEEs data: eyJpbnN0YW5jZUlkIjoiMTRiYTY0YjgtN2QzMS04YTNjLWZhNTMtODJmN2JiNzY5N2MyIiwic2lnbkRhdGUiOiIyMDE3LTA1LTIxVDA2OjAyOjAyLjE1NVoiLCJ1aWQiOm51bGwsImlwQW5kUG9ydCI6IjczLjI1Mi4yMDUuMTM4LzM3NjQwIiwidmVuZG9yUHJvZHVjdElkIjoiMSIsImRlbW9Nb2RlIjpmYWxzZSwiYWlkIjoiZTc4NGQ1MDYtYWYxZS00OWI0LTkzMDQtN2M4ZGQxZDI3ZjJmIiwic2l0ZU93bmVySWQiOiI5MTI0ZDQ0YS1kZmY5LTRiNDEtYjZjZi0wZWEzMmE5MjAxYjkifQ
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/models/AuthenticationResolver.php (67)
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/models/AppIdBuilder.php (19)
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/models/AppIdBuilder.php (33)
    2017/05/21 06:02:05 [trace] [application] Instance code decoded:  9	öæó2ΎW@ÞÛºXŸœzSåËD‘Ð
    K data hashed: 9	öæó2ΎW@ÞÛºXŸœzSåËD‘Ð
    K
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/models/AuthenticationResolver.php (69)
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/models/AppIdBuilder.php (19)
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/models/AppIdBuilder.php (33)
    2017/05/21 06:02:05 [trace] [application] Instance is:  Instance::__set_state(array(
       'params' => 
      array (
        'instanceId' => '14ba64b8-7d31-8a3c-fa53-82f7bb7697c2',
        'signDate' => '2017-05-21T06:02:02.155Z',
        'uid' => NULL,
        'ipAndPort' => '73.252.205.138/37640',
        'vendorProductId' => '1',
        'demoMode' => false,
        'aid' => 'e784d506-af1e-49b4-9304-7c8dd1d27f2f',
        'siteOwnerId' => 'xxx',

        'instance' => 'RET£$%^£ERGDFGDFGDERYERTY-nHoPUxjly0SR0A0KEEs.sdfgsdfghsgdfgsjczLjI1Mi4ysdfgsdfgsdfgsMDQtN2M4ZGQxZDI3ZjJmIiwic2l0ZU93bmVySWQiOiI5MTI0ZDQ0YS1kZmY5LTRiNDEtYjZjZi0wZWEzMmE5MjAxYjkifQ',
      ),
    ))
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/models/AppIdBuilder.php (26)
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/models/AppIdBuilder.php (33)
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/controllers/SiteController.php (59)
    2017/05/21 06:02:05 [trace] [application] IdClass name : IdPub viewMode: site
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/models/AppIdBuilder.php (41)
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/protected/controllers/SiteController.php (59)
    in /var/lib/server/oplympus/app-root/runtime/repo/reviewme/index.php (13)

When using the codec plain i get each single line as a separated document and logs of grok parse failures. So I tried the multiline as per original page documentation and in my case it makes sense to merge the lines with no timestamp back into the previous as you can see from the sample log. So

input { 
  file {
    path => "/tmp/logs/application*.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    #codec => plain {
	#	charset => "ISO-8859-1"
	#}
	codec => multiline {
	  # Grok pattern names are valid! :)
	  pattern => "^%{TIMESTAMP_ISO8601} "
	  negate => true
	  what => previous
	  charset => "ISO-8859-1"
	}
  }
}

filter {
  grok {
    match => { "message" => "%{DATESTAMP:timestamp} \[%{LOGLEVEL:log-level}\] \[%{WORD:app}\]" }
  }
  date {
    match => [ "timestamp" , "yyyy/MM/dd HH:mm:ss", "ISO8601" ]
  }
}

output {
  elasticsearch { 
    hosts => ["http://127.0.0.1:9200"]
    #protocol => http
    index => "myindex"
  }
  stdout { codec => rubydebug }
}

And when i query the index in Kibana it returns nothing. So querying Elastic directly with GET myindex/_search gives me documents with content

"message": """
2017/03/08 04:43:56 [info] [application] Cache set to:  CApcCache. Enabled : Yes. Default Expire: 604800
2017/03/08 04:43:56 [info] [application] Not Returning cached google place my_place_FDgdfgDFgdFGdfgdfgfer : cachedPlace

So it is not detecting the timestamp, but I also tried with DATESTAMP and this should really work as far as i can see from several resources like https://github.com/elastic/logstash/blob/v1.4.0/patterns/grok-patterns

Could you give me your suggestion on what is happening and how to make multiline work well?

it seems the codec multi line isn't matching because the timestamp has the year at the start and none of the patterns in https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns beings with the year, other than DATESTAMP_EVENTLOG and TIMESTAMP_ISO8601, but the expect different separators.

You can just do pattern => "^%{YEAR}" in the grok pattern. from the looks of it only the head line of log entry starts with a year.

Then in grok you can something like (?<timestamp>\d\d\d\d/\d\d/\d\d %{TIME}) to match the timestamp field.

I eventually worked this out as you just mention the timestamp is wrong so I loaded a custom pattern to overriede DATE

DATE_YMD %{YEAR}/%{MONTHNUM}/%{MONTHDAY}
DATE %{DATE_US}|%{DATE_EU}|%{DATE_YMD}

I have not tried yet as I did temporarely reverted to file codec, but I will give it a shot tonight again. Thank you

1 Like

It looks like DATESTAMP was not working for some reason (maybe the multiline coded does not recognise extra patterns?) but using DATE it worked. I still see some errors but most of the data if processed correctly now

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.