Aggregate filter produces different output even with the same input

I have run into a similar problem again, this time with a large amount of test data. Setting pipeline.workers=1 and pipeline.java_execution=false in logstash.yml no longer helps.

I have two input files:
inbound.txt:
17:03:56.662 53897 55931
17:03:56.737 53898 55932
17:03:56.814 53899 55933
17:03:56.889 53900 55934
outbound.txt:
17:03:56.887 77307 55931
17:03:56.953 77308 55932
17:03:57.028 77309 55933
17:03:57.105 77310 55934

My current status: testing with 10/20/30 lines of data works perfectly. With 100+ lines of test data, however, the results become random. For example, with 100 lines in each of the two files, running the same Logstash config twice gives different results: the first run matches 87 records and leaves 13 unmatched; the second run matches only 58 records and leaves the other 42 unmatched.

Is something wrong in my config, or is this a bug in how the Logstash aggregate filter handles large volumes of data? Please advise!

```
filter {
  if [type] == "inbound" {
    grok { match => { "message" => "%{TIME:TIME1}\s+.*\s+%{NOTSPACE:FIXinID}" } }
  }

  if [type] == "outbound" {
    grok { match => { "message" => "%{TIME:TIME6} %{NOTSPACE:OutID} %{NOTSPACE:FIXinID}" } }
  }

  mutate { remove_field => ["@version","host","message","path"] }

  if [type] == "inbound" {
    aggregate {
      task_id => "%{FIXinID}"
      code => "map['time1_a'] = event.get('TIME1')"
      map_action => "create"
    }
  }

  if [type] == "outbound" {
    aggregate {
      task_id => "%{FIXinID}"
      code => "event.set('time6_a', event.get('TIME6')); event.set('time1_a', map['time1_a'])"
      map_action => "update"
      end_of_task => true
      timeout => 120
    }
  }
}
```

Are you relying on the relative order of events from two different inputs? I do not believe that is supported.

I imported the data from 2 different files in one input block, as shown below. Is that not supported either?

```
input {
  file {
    path => ["/vars/logstash/rowdata/inbound.txt"]
    sincedb_path => "/vars/logstash/test.sincedb"
    start_position => "beginning"
    mode => "read"
    type => "inbound"
  }
  file {
    path => ["/vars/logstash/rowdata/outbound.txt"]
    sincedb_path => "/vars/logstash/test.sincedb"
    start_position => "beginning"
    mode => "read"
    type => "outbound"
  }
}
```

That is two different input plugins. You cannot control the ordering of events.
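
A likely explanation for the random match counts: whenever an outbound event happens to be processed before its inbound counterpart, the outbound aggregate runs with map_action => "update", finds no map for that FIXinID, and silently skips it. If arrival order cannot be guaranteed, one order-tolerant pattern is to let both event types write into the aggregate map and have the filter push the merged map as a new event on timeout. A rough sketch, not tested against your data, reusing the field names from the config above:

```
filter {
  aggregate {
    task_id => "%{FIXinID}"
    # Each side stores its own timestamp in the map; the original events
    # are cancelled so only the merged map-event is emitted.
    code => "
      map['time1_a'] = event.get('TIME1') if event.get('type') == 'inbound'
      map['time6_a'] = event.get('TIME6') if event.get('type') == 'outbound'
      event.cancel
    "
    push_map_as_event_on_timeout => true
    timeout_task_id_field => "FIXinID"
    timeout => 10  # how long to wait for the missing side; tune to your data
  }
}
```

The pushed event carries time1_a, time6_a, and FIXinID regardless of which side arrived first; IDs that never match still emit a map with one side missing, which makes the gaps easy to spot.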

Got it. Thanks Badger for the info! It's very useful for me. Let me first try combining those files into one input, then run the Logstash config to aggregate all the events.
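
For reference, a single file input covering both files might look like the sketch below (untested). The assumptions are that file_sort_by => "path" makes inbound.txt read to completion before outbound.txt in read mode, and that [type] can be recovered from the path field, since one input cannot tag the two files differently:

```
input {
  file {
    path => ["/vars/logstash/rowdata/inbound.txt",
             "/vars/logstash/rowdata/outbound.txt"]
    sincedb_path => "/vars/logstash/test.sincedb"
    mode => "read"
    file_sort_by => "path"  # "inbound.txt" sorts before "outbound.txt"
  }
}
filter {
  # Derive [type] from the source file; this must run before any
  # mutate that removes the "path" field.
  if [path] =~ /inbound/ {
    mutate { add_field => { "type" => "inbound" } }
  } else {
    mutate { add_field => { "type" => "outbound" } }
  }
}
```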
