File codec multiline issue

Hello,

I am unable to get the multiline codec on the file input to send stack traces as a single event to Elasticsearch. My log files contain lines similar to these:

2016-07-26 09:05:44,090 INFO Request ...log4j Information...
2016-07-26 09:05:44,107 INFO Response ...log4j information...

(Please note that "log4j information" is filler; the actual lines contain request/response data.)

When I use this conf file, Logstash still sends each line of a stack trace as a separate message to Elasticsearch. I want to send the entire stack trace as a single message.

input {
  file {
    path => "/home/test/log4j.log"
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => previous
    }
    start_position => "beginning"
    ignore_older => 0
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => ["eshost:443"]
  }
}

Since every single line begins with a timestamp, Logstash will produce one event per line with this configuration. This shouldn't be surprising.

Under which conditions should lines be joined together? Should lines with "Request" after the loglevel be joined with the next line? Should lines with "Response" after the loglevel be joined with the previous line? Something else? Once you have described this in words we can try translating it into a multiline codec configuration.

How do you even know that all lines that you want to join are consecutive, i.e. that your log doesn't look like this:

2016-07-26 09:05:44,090 INFO Request ...log4j Information...
2016-07-26 09:05:44,100 INFO Something completely unrelated
2016-07-26 09:05:44,107 INFO Response ...log4j information...

Thanks Magnus for your reply! My issue is that, with the multiline codec configuration from my original post, Logstash matches even those lines that do not start with a timestamp.

Please give a few concrete examples of what kind of input you have and how you'd like to join those lines.

OK, sorry for the abstract messages earlier. (Please pardon me, I don't know how to format these lines.)
I expect the stack trace below to be sent as a single message to Elasticsearch, but it is sent as 3 separate lines:
[Line 1]2016-07-26 09:38:22,776 ERROR [Req=NzT7wkS3oqU+o/CC] com.myexample.taskscheduler Unable to access dynamo repository
com.myexample.taskscheduler.common.db.exception.DataAccessException: Cannot do operations on a non-existent table (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ResourceNotFoundException; Request ID: 1bf02c66-a8a4-46a3-9377-289ce4ae12ce)
[Line 2] at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:620) [spring-aop-4.2.6.RELEASE.jar:4.2.6.RELEASE]
[Line 3] at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:609) [spring-aop-4.2.6.RELEASE.jar:4.2.6.RELEASE]

I expect these lines to match, and that works fine because both have timestamps.

[Line 1] 2016-07-26 09:05:44,580 INFO [Req=NzIKmGezlnvwXbZV] com.myexample.taskscheduler taskReferenceID=ea94efce-9dc2-402c-8ee0-4ba02fff2a88, attemptNumber=1, scheduledDateEpochInMillis=1469549144518, createdDateEpochInMillis=0, lastUpdatedDateEpochInMillis=0, maxAttempts=3, state=null, taskType=NearInstantTask, contextData=contextData, retryPolicy=LINEAR_BACKOFF, clientID=test, DurationMillis=10
[Line 2] 2016-07-26 09:05:44,580 INFO [Req=NzIKmGezlnvwXbZV] com.myexample.taskscheduler taskReferenceID=ea94efce-9dc2-402c-8ee0-4ba02fff2a88, attemptNumber=1, scheduledDateEpochInMillis=1469549144518, createdDateEpochInMillis=0, lastUpdatedDateEpochInMillis=0, maxAttempts=3, state=null, taskType=NearInstantTask, contextData=contextData, retryPolicy=LINEAR_BACKOFF, clientID=test, DurationMillis=11

The stacktrace lines should be joined into a single event with the configuration you have:

$ cat test.config 
input {
  stdin {
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => previous
    }
  }
}
output { stdout { codec => rubydebug } }
$ cat data 
2016-07-26 09:38:22,776 ERROR [Req=NzT7wkS3oqU+o/CC] com.myexample.taskscheduler Unable to access dynamo repository com.myexample.taskscheduler.common.db.exception.DataAccessException: Cannot do operations on a non-existent table (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ResourceNotFoundException; Request ID: 1bf02c66-a8a4-46a3-9377-289ce4ae12ce)
	at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:620) [spring-aop-4.2.6.RELEASE.jar:4.2.6.RELEASE]
	at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:609) [spring-aop-4.2.6.RELEASE.jar:4.2.6.RELEASE]
2016-07-26 09:38:22,776 FILLER TO FORCE LOGSTASH TO FLUSH
$ wc -l data 
4 data
$ /opt/logstash/bin/logstash -f test.config < data
Settings: Default pipeline workers: 8
Pipeline main started
{
    "@timestamp" => "2016-08-12T05:29:21.026Z",
       "message" => "2016-07-26 09:38:22,776 ERROR [Req=NzT7wkS3oqU+o/CC] com.myexample.taskscheduler Unable to access dynamo repository com.myexample.taskscheduler.common.db.exception.DataAccessException: Cannot do operations on a non-existent table (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ResourceNotFoundException; Request ID: 1bf02c66-a8a4-46a3-9377-289ce4ae12ce)\n\tat org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:620) [spring-aop-4.2.6.RELEASE.jar:4.2.6.RELEASE]\n\tat org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:609) [spring-aop-4.2.6.RELEASE.jar:4.2.6.RELEASE]",
      "@version" => "1",
          "tags" => [
        [0] "multiline"
    ],
          "host" => "lnxolofon"
}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}
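
To make the grouping rule easier to reason about, here is a rough Python sketch of what negate => true with what => previous does. It is not Logstash code: the regular expression is only a simplified stand-in for "^%{TIMESTAMP_ISO8601} ", and the function name group_lines and the shortened org.example stack frames are made up for illustration.

```python
import re

# Simplified stand-in (assumption) for grok's "^%{TIMESTAMP_ISO8601} ".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")


def group_lines(lines):
    """Mimic negate => true, what => previous.

    A line that does NOT start with a timestamp is appended to the event
    currently being built. A line that does start with a timestamp closes
    that event and begins a new one.
    """
    events = []
    buffer = []
    for line in lines:
        if TIMESTAMP.match(line) and buffer:
            events.append("\n".join(buffer))
            buffer = []
        buffer.append(line)
    # The buffer holds the last event, which is still pending.
    return events, buffer


# Hypothetical, shortened sample lines.
lines = [
    "2016-07-26 09:38:22,776 ERROR Unable to access dynamo repository",
    "\tat org.example.Foo.bar(Foo.java:620)",
    "\tat org.example.Foo.baz(Foo.java:609)",
    "2016-07-26 09:38:22,776 FILLER TO FORCE LOGSTASH TO FLUSH",
]

events, pending = group_lines(lines)
print(len(events))  # completed events
print(pending)      # lines still waiting for the next timestamp
```

In this sketch the three stack trace lines come out as one event, while the filler line stays in the pending buffer. That is the reason the test data above ends with a filler line: the last event is only completed once another line starting with a timestamp arrives.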