ELPASED Filter - Works differently 6.8.1 vs 7.1 - Possible Bug?

I have noticed something interesting.
I have a configuration that uses the logstash-filter-elapsed plugin (latest version) to get the time difference between two log entries.

Log entries:

2019.04.25|09:36:49.850||B8CA3A947330-5CC1B251-1||B8CA3A947330-5CC1B251-146|(null)|MainThread|Generic|Begin|check_patient_for_completeness()|(null)|
2019.04.25|09:36:50.128||B8CA3A947330-5CC1B251-1||B8CA3A947330-5CC1B251-146|(null)|MainThread|Generic|End|check_patient_for_completeness()|(null)|

Logstash Config:

...MORE ABOVE
if "_grokparsefailure" in [tags]{
grok {
remove_tag => ["_grokparsefailure"]
patterns_dir => ["/etc/logstash/conf.d/patterns"]
match => { "message" => "%{YEAR:log_year}\.%{MONTHNUM:log_month}\.%{MONTHDAY:log_day}\|%{TIME:log_time}\|\|%{SESSIONUID:SESUID}\|\|%{SESSIONUID:THREADUID}\|\(null\)\|%{WORD:ThreadName}\|%{WORD:Action}\|Begin\|check_patient_for_completeness%{GREEDYDATA}"}	
		add_tag => [ "CheckPatientForCompletenessStart"]
		add_field => { "event" => "CheckPatientForCompletenessStart" }
	}
}

if "_grokparsefailure" in [tags]{
	grok {
		remove_tag => ["_grokparsefailure"]
		patterns_dir => ["/etc/logstash/conf.d/patterns"]
		match => { "message" => "%{YEAR:log_year}\.%{MONTHNUM:log_month}\.%{MONTHDAY:log_day}\|%{TIME:log_time}\|\|%{SESSIONUID:SESUID}\|\|%{SESSIONUID:THREADUID}\|\(null\)\|%{WORD:ThreadName}\|%{WORD:Action}\|End\|check_patient_for_completeness%{GREEDYDATA}"}
		
		add_tag => [ "CheckPatientForCompletenessEnd"]
		add_field => { "event" => "CheckPatientForCompletenessEnd" }
	}
}
elapsed {
    start_tag => "CheckPatientForCompletenessStart"
    end_tag => "CheckPatientForCompletenessEnd"
    unique_id_field => "THREADUID"
}

I run the configs with the following command line parameters for 6.8.1:

/logstash-6.8.1/bin# ./logstash -f /etc/logstash/conf.d/performance_log/v2/performance.log.conf -w 1 -r

and for my 7.1 installation:

/usr/share/logstash/bin#  ./logstash -f /etc/logstash/conf.d/performance_log/v2/performance.log.conf -w 1 -r

However, I get different results!
The 6.8.1 correctly identifies the CheckPatientForCompletenessStart tag as the start and the CheckPatientForCompletenessEnd tag as the ending parameter.
However the 7.1 installation does not do this correctly and constantly displays "elapsed_end_without_start" tag.
The only thing that has changed between these two runs is the version of logstash.

This smells like a bug, but I'm not sure where. Maybe 7.1 is not properly interpreting the -w 1 flag correctly?

7.1 Example (notice how it seems to process the end but completely ignores the start):

{
     "@timestamp" => 2019-04-25T13:36:50.128Z,
"workstationName" => "ii-rs-hc-mam-04",
          "event" => "CheckPatientForCompletenessEnd",
           "tags" => [
    [0] "CheckPatientForCompletenessEnd",
    [1] "elapsed_end_without_start"
],
      "timestamp" => "2019 04 25 09:36:50.128",
       "SiteName" => "USA-SOMEWHERE-XX",
       "@version" => "1",
           "type" => "plain",
      "THREADUID" => "B8CA3A947330-5CC1B251-146",
        "message" => "2019.04.25|09:36:50.128||B8CA3A947330-5CC1B251-1||B8CA3A947330-5CC1B251-146|(null)|MainThread|Generic|End|check_patient_for_completeness()|(null)|\r",
         "Action" => "Generic",
           "host" => "wtlsuv403.sitename.com",
           "path" => "/var/log/perf_logv2/v3/client_performance_USA-SOMEWHERE-XX_ii-rs-hc-mam-04.log",
         "SESUID" => "B8CA3A947330-5CC1B251-1",
     "ThreadName" => "MainThread"
}
{
     "@timestamp" => 2019-04-25T13:36:49.850Z,
"workstationName" => "ii-rs-hc-mam-04",
          "event" => "CheckPatientForCompletenessStart",
           "tags" => [
    [0] "CheckPatientForCompletenessStart"
],
      "timestamp" => "2019 04 25 09:36:49.850",
       "SiteName" => "USA-SOMEWHERE-XX",
       "@version" => "1",
           "type" => "plain",
      "THREADUID" => "B8CA3A947330-5CC1B251-146",
        "message" => "2019.04.25|09:36:49.850||B8CA3A947330-5CC1B251-1||B8CA3A947330-5CC1B251-146|(null)|MainThread|Generic|Begin|check_patient_for_completeness()|(null)|\r",
         "Action" => "Generic",
           "host" => "wtlsuv403.sitename.com",
           "path" => "/var/log/perf_logv2/v3/client_performance_USA-SOMEWHERE-XX_ii-rs-hc-mam-04.log",
         "SESUID" => "B8CA3A947330-5CC1B251-1",
     "ThreadName" => "MainThread"
}

6.8.1 Example (Correct)

{
           "type" => "plain",
     "ThreadName" => "MainThread",
         "SESUID" => "B8CA3A947330-5CC1B251-1",
       "@version" => "1",
           "path" => "/var/log/perf_logv2/v3/client_performance_USA-SOMEWHERE-XX_ii-rs-hc-mam-04.log",
      "THREADUID" => "B8CA3A947330-5CC1B251-146",
          "event" => "CheckPatientForCompletenessStart",
           "host" => "wtlsuv403.sitename.com",
     "@timestamp" => 2019-04-25T13:36:49.850Z,
"workstationName" => "ii-rs-hc-mam-04",
         "Action" => "Generic",
      "timestamp" => "2019 04 25 09:36:49.850",
           "tags" => [
    [0] "CheckPatientForCompletenessStart"
],
       "SiteName" => "USA-SOMEWHERE-XX"
}
{
                   "type" => "plain",
             "ThreadName" => "MainThread",
                 "SESUID" => "B8CA3A947330-5CC1B251-1",
"elapsed_timestamp_start" => 2019-04-25T13:36:49.850Z,
               "@version" => "1",
                   "path" => "/var/log/perf_logv2/v3/client_performance_USA-SOMEWHERE-XX_ii-rs-hc-mam-04.log",
              "THREADUID" => "B8CA3A947330-5CC1B251-146",
                  "event" => "CheckPatientForCompletenessEnd",
                   "host" => "wtlsuv403.sitename.com",
             "@timestamp" => 2019-04-25T13:36:50.128Z,
        "workstationName" => "ii-rs-hc-mam-04",
                 "Action" => "Generic",
              "timestamp" => "2019 04 25 09:36:50.128",
                   "tags" => [
    [0] "CheckPatientForCompletenessEnd",
    [1] "elapsed",
    [2] "elapsed_match"
],
           "elapsed_time" => 0.278,
               "SiteName" => "USA-SOMEWHERE-XX"
}

In V7, where the java execution engine became the default, logstash does not preserve the order of lines in the file, even with '--pipeline.workers 1'. Thus the end is processed before the begin, so you get a "elapsed_end_without_start" tag.

For now you could set 'pipeline.java_execution: false' in logstash.yml, but I do not expect elastic to maintain two engines, eventually java will be the only option.

1 Like

Great answer thank you!
This will be a problem for a lot of my use-cases as beginning and end events are quite common in the logs that I parse.
Can you think of another way to get the same result without having to rely on the order of events?

I cannot think of a way to do it if order is not preserved. In fact the aggregate filter is no longer of much use if order is not preserved even when '-w 1'.

Here is a minimal example that shows the problem. The file /tmp/foo.txt
contains

2019.04.25|09:36:49.850||B8CA3A947330-5CC1B251-1||B8CA3A947330-5CC1B251-146|(null)|MainThread|Generic|Begin|check_patient_for_completeness()|(null)|
2019.04.25|09:36:50.128||B8CA3A947330-5CC1B251-1||B8CA3A947330-5CC1B251-146|(null)|MainThread|Generic|End|check_patient_for_completeness()|(null)|

With 7.1.1 and 1.8.0_201 the following configuration will consistently output the Begin line after the End line (and in the original configuration, which used an elapsed filter, it was clear they were processed through the pipeline in that order).

input { file { path => "/tmp/foo.txt" sincedb_path => "/dev/null" start_position => beginning } }
filter {
    grok {
        pattern_definitions => { "SESSIONUID" => "[-A-Z0-9]+" }
        match => { "message" => "%{YEAR:log_year}\.%{MONTHNUM:log_month}\.%{MONTHDAY:log_day}\|%{TIME:log_time}\|\|%{SESSIONUID:SESUID}\|\|%{SESSIONUID:THREADUID}\|\(null\)\|%{WORD:ThreadName}\|%{WORD:Action}\|Begin\|check_patient_for_completeness%{GREEDYDATA}"}
    }
    if "_grokparsefailure" in [tags]{
        grok {
            pattern_definitions => { "SESSIONUID" => "[-A-Z0-9]+" }
            match => { "message" => "%{YEAR:log_year}\.%{MONTHNUM:log_month}\.%{MONTHDAY:log_day}\|%{TIME:log_time}\|\|%{SESSIONUID:SESUID}\|\|%{SESSIONUID:THREADUID}\|\(null\)\|%{WORD:ThreadName}\|%{WORD:Action}\|End\|check_patient_for_completeness%{GREEDYDATA}"}
        }
    }
}
output { stdout { codec => rubydebug { metadata => false } } }

Very slight changes to this, like simplifying the grok pattern, or commenting out the if so that the second grok is unconditional will reverse the order of the events. I don't see how this can not be a bug, but I don't get to decide that.

1 Like

Agreed. I will try to simplify my config, but there are other parts to this pipeline that I did not paste.

Thank you for your help! Browsing the forum I've noticed the aggregation post as well. Definitely an issue with that plugin too. @joshdover @Bargs @shaunak Any chance this behaviour can be escalated as a bug?

Another interesting case. This involves a generator input rather than a file input, and the 2 events go through the pipeline in the "right" order, but appear in the output in the reverse order.

filter { json { source => "message" } }

input { generator { count => 1 lines => [ '{ "test": 1, "foo": "RECEIVED TSA REQUEST" }', '{ "test": 1, "foo": "FINISHED TSA REQUEST" }' ] } }

filter {
    if [foo] =~ "RECEIVED TSA REQUEST" {
        aggregate {
            task_id => "%{test}"
            code => "map['timestamp'] = event.get('@timestamp').to_f"
            map_action => "create"
        }
    }

    if [foo] =~ "FINISHED TSA REQUEST" {
        aggregate {
            task_id => "%{test}"
            code => "event.set('someField', map['timestamp'] - event.get('@timestamp').to_f)"
            map_action => "update"
            end_of_task => true
        }
    }
}
output { stdout { codec => rubydebug { metadata => false } } }

produces

{
       "foo" => "FINISHED TSA REQUEST",
 "someField" => -0.02200007438659668,
      [...]
      "test" => 1,
   "message" => "{ \"test\": 1, \"foo\": \"FINISHED TSA REQUEST\" }",
"@timestamp" => 2019-07-01T14:08:11.312Z
}
{
       "foo" => "RECEIVED TSA REQUEST",
[...]
      "test" => 1,
   "message" => "{ \"test\": 1, \"foo\": \"RECEIVED TSA REQUEST\" }",
"@timestamp" => 2019-07-01T14:08:11.290Z
}
1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.