Using the aggregate filter to count events and sum field values

Given that my last post was 43 minutes ago, I know you did not wait the 3600 seconds for the timeout to occur. You removed the event.cancel so you will see all the original data rows. Put the event.cancel back into the code and reduce the timeout.

3600 was just a first try :grin:, I used a shorter timeout.
Previously I tried event.cancel() as follows, but in that case there was no output and Logstash terminated:

aggregate {
    task_id => "%{aggregate_id}"
    code => "
        map['company_count'] ||= 0
        map['company_count'] += 1
        event.set('company_count', map['company_count'])
        event.cancel()
    "
    push_map_as_event_on_timeout => true
    timeout => 60
}

Logstash terminal output:

[2020-08-12T23:29:32,491][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-08-12T23:29:33,714][INFO ][logstash.pipeline        ] Pipeline has terminated {:pipeline_id=>"main", :thread=>"#<Thread:0x144a64c9 run>"}

You did disable java_execution like I said, right?

My logstash.yml config:

 pipeline.id: main
 path.config: /ELK/logstash-6.6.0//bin/myconf.conf
 pipeline.workers: 1
 pipeline.java_execution: false

Actually, without event.cancel() the output is as follows:

cm1-ser1-2020-08-09-1
cm1-ser1-2020-08-09-2
cm1-ser1-2020-08-09-3
cm1-ser1-2020-08-09-4
cm1-ser1-2020-08-09-5
cm1-ser1-2020-08-09-6
cm2-ser1-2020-08-09-1
cm2-ser1-2020-08-09-2
cm2-ser1-2020-08-09-3
cm2-ser1-2020-08-09-4
cm1-ser1-2020-08-09-7
cm1-ser1-2020-08-09-8
cm1-ser1-2020-08-09-9
cm1-ser1-2020-08-09-10
cm1-ser1-2020-08-09-11
cm1-ser1-2020-08-09-12
cm1-ser1-2020-08-09-13
cm1-ser1-2020-08-09-14
cm1-ser1-2020-08-09-15
cm1-ser1-2020-08-09-16
cm1-ser1-2020-08-09-17
cm1-ser1-2020-08-09-18
cm1-ser1-2020-08-09-19
cm1-ser1-2020-08-09-20
cm1-ser1-2020-08-09-21
cm1-ser1-2020-08-09-22
cm1-ser1-2020-08-09-23
cm1-ser1-2020-08-09-24
cm1-ser1-2020-08-09-25
cm1-ser1-2020-08-09-26
cm1-ser1-2020-08-09-27
cm1-ser1-2020-08-09-28
cm1-ser1-2020-08-09-29
cm1-ser1-2020-08-09-30
cm1-ser1-2020-08-09-31
cm1-ser1-2020-08-09-32
cm1-ser1-2020-08-09-33
cm1-ser1-2020-08-09-34
cm1-ser1-2020-08-09-35
cm1-ser1-2020-08-09-36
cm1-ser1-2020-08-09-37
cm1-ser1-2020-08-09-38
cm1-ser1-2020-08-09-39
cm1-ser1-2020-08-09-40
cm1-ser1-2020-08-09-41
cm1-ser1-2020-08-09-42
cm2-ser1-2020-08-09-5
cm2-ser1-2020-08-09-6
cm1-ser1-2020-08-09-43
cm1-ser1-2020-08-09-44
cm2-ser1-2020-08-09-7
cm2-ser1-2020-08-09-8
cm1-ser2-2020-08-09-1
cm1-ser2-2020-08-09-2
cm2-ser1-2020-08-09-9
cm2-ser1-2020-08-09-10
cm1-ser2-2020-08-09-3
cm1-ser2-2020-08-09-4
cm2-ser1-2020-08-09-11
cm2-ser1-2020-08-09-12
cm1-ser1-2020-08-09-45
cm1-ser1-2020-08-09-46
cm1-ser1-2020-08-09-47
cm1-ser1-2020-08-09-48
cm1-ser1-2020-08-09-49
cm1-ser1-2020-08-09-50
cm1-ser1-2020-08-09-51
cm1-ser1-2020-08-09-52

But with event.cancel(), there is no output.

Try adding an additional input to your configuration:

input { exec { command => "/bin/true" interval => 864000 } }
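
A minimal sketch of how this keep-alive input could sit alongside the existing elasticsearch input (the hosts, index, and query here are placeholders, not taken from your setup):

    input {
      elasticsearch {
        hosts => ["localhost:9200"]                      # placeholder host
        index => "my-index"                              # placeholder index name
        query => '{ "query": { "match_all": {} } }'      # placeholder query
      }
      # A never-ending input: it keeps the pipeline running after the
      # elasticsearch query finishes, so the aggregate timeout can fire.
      exec { command => "/bin/true" interval => 864000 }
    }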

The input data come from Elasticsearch; could you please explain more about this additional input?

I put my data into a txt file and changed the Logstash input configuration to read the data with the file input instead of the elasticsearch input; with that, the aggregate filter works fine and counts the number of events. But my data is in Elasticsearch, so why doesn't it work with the elasticsearch input? Note that the Logstash script queries Elasticsearch for old data (from previous days); can that be the reason there is no output?

The additional input prevents the pipeline from terminating once the elasticsearch query has completed. If the pipeline terminates, I do not think the timeout will ever fire.
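
As a side note, if you also want each pushed map event to identify which task it belongs to, the aggregate filter has a timeout_task_id_field option that copies the task id into the generated event. A sketch reusing the field names from the config above:

    aggregate {
        task_id => "%{aggregate_id}"
        code => "
            map['company_count'] ||= 0
            map['company_count'] += 1
            event.cancel()
        "
        push_map_as_event_on_timeout => true
        timeout_task_id_field => "aggregate_id"   # put the task id on the timeout event
        timeout => 60
    }

With this, each event emitted on timeout carries both company_count and the aggregate_id it was counted for.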

Many thanks. It works :slightly_smiling_face:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.