Sum with unique object

For example, if I have a CSV file with these fields:

Car, Parking, Date of entry
V1, p1, 12-03-2020 12:30
V1, p1, 12-03-2020 11:30
V2, p2, 12-03-2020 10:30
V2, p2, 12-03-2020 10:45

How can I calculate the number of times V1 entered p1? And the same thing for the other cars?
Any ideas that could help me, please!

Use an aggregate filter.

Thanks for the quick response. I tried this code and it works:

aggregate {
    task_id => "%{Car}"
    code => "
        map['sum'] ||= 0    # initialize the counter on the first event for this car
        map['sum'] += 1     # count one entry
        event.cancel        # drop the original event
    "
    push_map_as_event_on_timeout => true
    timeout_task_id_field => "Car"
    timeout => 10
}

My problem now is that this code eliminates the other columns, for example the date of entry of each car.
How can I keep the other data related to each car and just add a new column with the number of times it entered this parking?

If there are additional columns you want to add to the event then add them to the map.
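
For example, something like this (a sketch; the Parking and Date_of_entry field names are assumptions based on the CSV sample above):

aggregate {
    task_id => "%{Car}"
    code => "
        # copy the extra columns onto the map so they survive into the pushed event
        map['Parking'] ||= event.get('Parking')
        map['Date_of_entry'] ||= event.get('Date_of_entry')
        map['sum'] ||= 0
        map['sum'] += 1
        event.cancel
    "
    push_map_as_event_on_timeout => true
    timeout_task_id_field => "Car"
    timeout => 10
}

Note that ||= only assigns when the map entry is still nil, so this keeps just the first value seen for each car.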

OK, I added the other columns, but I have a question: why does it not display all the entry dates for each car? It only keeps one entry date per car.
Is there another method to keep all the data, as in my CSV file, and just add a column which calculates the sum?

Well, event.cancel is optional. If you remove that you will get all the original events as well as events with the aggregated data.

Alternatively, for each value of the task id, create an array on the map and add each event's hash to it:

map['events'] ||= []              # create the array once per car
map['events'] << event.to_hash    # append the whole event

then split the array into separate events using a split filter once the aggregation is done.
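
Put together, that could look something like this (a sketch; the events field name and the 10 second timeout are just illustrations):

aggregate {
    task_id => "%{Car}"
    code => "
        map['events'] ||= []
        map['events'] << event.to_hash   # keep every original row
        map['sum'] ||= 0
        map['sum'] += 1
        event.cancel
    "
    push_map_as_event_on_timeout => true
    timeout_task_id_field => "Car"
    timeout => 10
}
# turn the array back into one event per original row
split { field => "events" }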

Thank you so much for your help; I really appreciate it.
I understood correctly and tried everything; in the end the original data is stored separately and the aggregated data is stored with the sum.
My last question: can I eliminate @timestamp and @version in the aggregated data? How can I add that to this code?
aggregate {
    task_id => "%{Car}"
    code => "
        map['Car_registration'] ||= []    # initialize the array once per car
        map['Car_registration'] << { 'Car' => event.get('Car') }
        map['sum'] ||= 0
        map['sum'] += 1
    "
    push_map_as_event_on_timeout => true
    timeout_task_id_field => "Car"
    timeout => 5
}
split {
    field => "Car_registration"
}

I don't think those fields are optional, but you could try

mutate { remove_field => [ "@timestamp", "@version" ] }
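
If you only want to remove them from the aggregated events, you could guard the mutate with a conditional (a sketch, assuming only the pushed map events carry a [sum] field):

if [sum] {
    mutate { remove_field => [ "@timestamp", "@version" ] }
}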

It works, thanks! You are so helpful :wink:

Can I ask one more question, please?
For each CSV file it calculates the number of times that a car uses this parking, but when I add another file it calculates the same thing separately.
I receive the files in real time, and each time I add a file I want it to calculate the sum across all of them together.
Any help?

The aggregate filter will only aggregate data that arrives within the timeout. Extending the timeout may help. Otherwise I think you would need to aggregate the aggregates in Elasticsearch.

I think extending the timeout will not help me, because I receive the files every day. But for the second solution, aggregating the aggregates, how can I do that? I want to try it; can you explain more?

That is really an elasticsearch question.

Hi @Badger, I saw this topic: Extract Month and Year from date field.
I have the same problem, and I want a new field containing, for example, 2020-02.
I tried this code

grok { match => { "Date_of_entry" => "^%{YEAR:year}-%{MONTHNUM}" } }

but it extracts just the month, so what should I add to this code to get a result like 2020-02?
Can you answer me, please?

I would expect that to extract the year into [year]. It would not extract the month, but it would fail to match if the month was not present. If you want both year and month in a single field you could use

        pattern_definitions => { "YM" => "%{YEAR}-%{MONTHNUM}" }
        match => { "message" => "^%{YM:yearAndMonth}" }
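
As a complete filter applied to the field from earlier in this thread, that might look like this (a sketch; it assumes Date_of_entry starts with the year, as in your grok attempt above):

grok {
    pattern_definitions => { "YM" => "%{YEAR}-%{MONTHNUM}" }
    match => { "Date_of_entry" => "^%{YM:yearAndMonth}" }
}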

So beautiful!! It works, thank you so much :heart_eyes: :heart_eyes: :heartbeat:
