NOTE: I managed to figure out the problem, and it had nothing to do with Logstash. It was a problem with my text editor and how it saves files, which caused Logstash to process the same lines multiple times. In my case I was using WebStorm, but the same problem would be around with other JetBrains IDEs (PhpStorm, PyCharm, IntelliJ, ...). Switched to a good old text editor and everything worked. D'oh!
I have the following simple pipeline configuration for handling Java stack traces, being multiline events.
input {
file {
path => "/Users/Andy/Desktop/logstash-5.6.3/java_errors.log"
start_position => "beginning"
codec => multiline {
pattern => "^(\s+|\t)|(Caused by:)"
what => "previous"
auto_flush_interval => 5
}
}
}
output {
stdout {
codec => rubydebug
}
}
And I have the following test file, which is the one that is referenced within the file input.
Sep 9, 2017 6:14:31 AM ERROR com.example.blog.service.PostServiceImpl: Post Not Found (ID: 1)
at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:71)
at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:55)
at com.amazonaws.auth.QueryStringSigner.sign(QueryStringSigner.java:83)
at com.amazonaws.auth.QueryStringSigner.sign(QueryStringSigner.java:46)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:238)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:170)
Caused by: com.amazonaws.AmazonClientException: Unable to calculate a request signature: Empty key
at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:90)
at com.amazonaws.auth.AbstractAWSSigner.signAndBase64Encode(AbstractAWSSigner.java:68)
... 14 more
Caused by: java.lang.IllegalArgumentException: Empty key
at javax.crypto.spec.SecretKeySpec.<init>(SecretKeySpec.java:96)
at com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:87)
... 15 more
With the above, I start up Logstash with the following command:
bin/logstash -f config/pipelines/pipeline.conf --config.reload.automatic
When the pipeline is started up, the following event is processed:
{
"path" => "/Users/Andy/Desktop/logstash-5.6.3/java_errors.log",
"@timestamp" => 2017-10-10T21:31:56.658Z,
"message" => "Sep 9, 2017 6:14:31 AM ERROR com.example.blog.service.PostServiceImpl: Post Not Found (ID: 1)\n\t[MORE STACK TRACE HERE]\n\tat com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:87)",
"tags" => [
[0] "multiline"
]
}
The last line is missing from the event because the file does not end with a newline, so this is expected. If I then add a new stack trace to the test file, I get the following output immediately:
{
"path" => "/Users/Andy/Desktop/logstash-5.6.3/java_errors.log",
"@timestamp" => 2017-10-10T21:34:48.367Z,
"message" => "\t... 15 more"
}
And the following after five seconds (the next stack trace without the last line):
{
"path" => "/Users/Andy/Desktop/logstash-5.6.3/java_errors.log",
"@timestamp" => 2017-10-10T21:34:53.637Z,
"message" => "Sep 9, 2017 6:14:31 AM ERROR com.example.blog.service.PostServiceImpl: Post Not Found (ID: 2)\n\t[MORE STACK TRACE HERE]\n\tat com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:87)",
"tags" => [
[0] "multiline"
]
}
This behavior repeats every time a new stack trace is added. This is not a problem per se, because if I just make sure the file always ends with a newline, everything works as intended. However, the behavior is different with Logstash 6.0.0 RC1. With this version, starting up the exact same pipeline outputs the following:
{
"path" => "/Users/Andy/Desktop/logstash/java_errors.log",
"@timestamp" => 2017-10-10T21:40:27.158Z,
"message" => "Sep 9, 2017 6:14:31 AM ERROR com.example.blog.service.PostServiceImpl: Post Not Found (ID: 1)\n\t[MORE STACK TRACE HERE]\n\tat com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:87)",
"tags" => [
[0] "multiline"
]
}
So far the same result; the last line of the stack trace is missing. Then I add another stack trace to the file, and the following output appears immediately:
{
"path" => "/Users/Andy/Desktop/logstash/java_errors.log",
"@timestamp" => 2017-10-10T21:42:58.895Z,
"message" => "Sep 9, 2017 6:14:31 AM ERROR com.example.blog.service.PostServiceImpl: Post Not Found (ID: 1)\n\t[MORE STACK TRACE HERE]\n\tat com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:87)\n\t... 15 more",
"tags" => [
[0] "multiline"
]
}
This is the first event of the file, this time including the last line. Five seconds later, the following output appears:
{
"path" => "/Users/Andy/Desktop/logstash/java_errors.log",
"@timestamp" => 2017-10-10T21:43:04.079Z,
"message" => "Sep 9, 2017 6:14:31 AM ERROR com.example.blog.service.PostServiceImpl: Post Not Found (ID: 2)\n\t[MORE STACK TRACE HERE]\n\tat com.amazonaws.auth.AbstractAWSSigner.sign(AbstractAWSSigner.java:87)",
"tags" => [
[0] "multiline"
]
}
The above is the second stack trace, but without its last line.
If I add more stack traces to the file, the above repeats, so essentially processing the same stack traces twice with and without the last line. Pretty funky.
I know I can handle the stack traces in a simpler way by using the "negate" option, but I am trying to demonstrate this approach for a tutorial. I looked through the changelog for version 6.0.0 and couldn't find anything that should cause this. Also looked at the "multiline" codec, but it doesn't seem like anything changed.
Basically I am trying to figure out why the two versions of Logstash produce different results. Version 5.x produces what I expected, and version 6.0.0 RC1 doesn't. Does anyone know the explanation for the different behavior?
Thanks so much in advance, and sorry for the lengthy post, but I wanted to include as much detail as possible.