Logstash Aggregate Filter outputs a different result each time

I have 2 input files like below -
inbound.txt:
17:03:56.662 53897 55931
17:03:56.737 53898 55932
17:03:56.814 53899 55933
17:03:56.889 53900 55934
outbound.txt:
17:03:56.887 77307 55931
17:03:56.953 77308 55932
17:03:57.028 77309 55933
17:03:57.105 77310 55934
17:03:57.180 77311 55935

My aggregate filter section in the Logstash config is below. The purpose is to merge lines that share the same third ID (I named it FIXinID) into one event, so that afterwards I can calculate the time gap for each InID. My current trouble is that the aggregate filter sometimes works perfectly and sometimes does not, and I can't figure out why. Can anyone advise?

filter {
    if [type] == "inbound" {
        grok { match => { "message" => "%{TIME:TIME1}\s+.*\s+%{NOTSPACE:FIXinID}" } }
    }

    if [type] == "outbound" {
        grok { match => { "message" => "%{TIME:TIME6}\s+%{NOTSPACE:OutID}\s+%{NOTSPACE:FIXinID}" } }
    }

    mutate { remove_field => ["@version","host","message","path"] }

    if [type] == "inbound" {
        aggregate {
            task_id => "%{FIXinID}"
            code => "map['time1_a'] = event.get('TIME1')"
            map_action => "create"
        }
    }

    if [type] == "outbound" {
        aggregate {
            task_id => "%{FIXinID}"
            code => "event.set('time6_a', event.get('TIME6')); event.set('time1_a', map['time1_a'])"
            map_action => "update"
            end_of_task => true
            timeout => 120
        }
    }
}
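Once a merged outbound event carries both time1_a and time6_a, the gap calculation itself is simple. A minimal Ruby sketch (the helper names to_seconds and gap_ms are my own, not Logstash API; inside Logstash this logic would sit in a ruby filter's code block and end with something like event.set('gap_ms', ...)):

```ruby
# Parse an HH:MM:SS.mmm timestamp into seconds since midnight.
def to_seconds(ts)
  h, m, s = ts.split(':')
  h.to_i * 3600 + m.to_i * 60 + s.to_f
end

# Gap in milliseconds between the inbound (time1) and outbound (time6) timestamps.
def gap_ms(time1, time6)
  ((to_seconds(time6) - to_seconds(time1)) * 1000).round
end

puts gap_ms('17:03:56.662', '17:03:56.887')  # => 225 (FIXinID 55931 from the sample data)
```

Note this only works while both timestamps fall on the same day; timestamps that straddle midnight would need a date component.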

The combined result is sometimes perfect: all events merge successfully except the lone FIXinID=55935, which has no inbound partner. At other times, however, only 2 or 3 inbound/outbound pairs combine.

Have you set pipeline.workers to 1? Are you running a version where you need to disable java_execution (anything from 7.0 to 7.6)?
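For reference, both settings can be put in logstash.yml (a sketch; pipeline.java_execution only exists as a setting in the 7.x versions mentioned above):

pipeline.workers: 1
pipeline.java_execution: false

The same effect can be had on the command line with -w 1 (for the worker count). The aggregate filter requires a single worker because its maps are keyed per task_id but events processed on different workers can interleave arbitrarily.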


I tried "-w 1" on my command line, and it worked! Thanks so much, Badger!

I have encountered a similar problem again while using a much larger set of test data. This time, setting pipeline.workers: 1 and pipeline.java_execution: false in logstash.yml no longer helps.

So my current status is: with 10/20/30 lines of test data it works perfectly. With 100+ lines of test data, however, the output becomes random. For example, with 100 lines in each of the 2 files, running the same Logstash config twice gives different results: the first time, 87 records matched successfully and 13 did not; the second time, only 58 records matched and the other 42 did not.

What is going wrong in my config, or is this a bug in the Logstash aggregate filter when it handles a large volume of data? Please advise!

Hello Badger, I have encountered a similar problem with the above config. Could you please advise? Thanks in advance!

If you have set pipeline.workers: 1 and pipeline.java_execution: false, I am not aware of any other reason why data might not be ordered.
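To illustrate why ordering matters for this config: with map_action => "create" on inbound and map_action => "update" on outbound, an outbound event that is processed before its inbound partner finds no map and never receives time1_a. A toy Ruby simulation of that map bookkeeping (an assumption about the create/update semantics for illustration only, not Logstash code):

```ruby
# maps plays the role of the aggregate filter's per-task_id map store.
maps = {}

# map_action => "create": the inbound event creates the map and stores TIME1.
def inbound(maps, id, t1)
  maps[id] = { 'time1_a' => t1 }
end

# map_action => "update" with end_of_task => true: the outbound event consumes
# the map. If no map exists (outbound arrived first), the merge is lost.
def outbound(maps, id, t6)
  m = maps.delete(id)
  m ? { 'time1_a' => m['time1_a'], 'time6_a' => t6 } : nil
end

inbound(maps, '55931', '17:03:56.662')
p outbound(maps, '55931', '17:03:56.887')  # both timestamps: merge succeeded
p outbound(maps, '55935', '17:03:57.180')  # => nil: no inbound map, match lost
```

This is consistent with the symptom above: the more events in flight, the more chances for an outbound line to be processed before its inbound counterpart unless ordering is strictly preserved.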

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.