How to remove duplication when aggregating data for visualization?

tinrik · February 1, 2023, 10:10am

Hi!

Based on answers from other posts, I learned that I need to upload denormalized (flattened) data to Kibana, which can come at the cost of sending duplicated values:

{ file: foo, project: foo, count: 123, id: 1 } 
{ file: foo, project: bar, count: 123, id: 1 }    

{ file: foo, project: foo, count: 321, id: 2 } 
{ file: foo, project: bar, count: 321, id: 2 } 

{ file: bar, project: foo, count: 111, id: 1 } 
{ file: bar, project: bar, count: 111, id: 1 } 

{ file: bar, project: foo, count: 222, id: 2 } 
{ file: bar, project: bar, count: 222, id: 2 }

Let's say I want a table that sums count over all ids for each file, ignoring the duplicate entries created by project (a field that is needed elsewhere). That is, the table should show:

File    Sum of count across ids
foo     123 + 321 = 444
bar     111 + 222 = 333

However, a naive Sum aggregation double-counts the entries that differ only by project, producing 888 and 666 respectively.
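To make the arithmetic concrete, here is a small sketch in plain Python (not Kibana) that reproduces both results from the sample documents. The function names are made up for illustration, and the deduplicated version assumes that every document sharing the same file and id carries the same count.

```python
from collections import defaultdict

# Sample documents copied from the post.
docs = [
    {"file": "foo", "project": "foo", "count": 123, "id": 1},
    {"file": "foo", "project": "bar", "count": 123, "id": 1},
    {"file": "foo", "project": "foo", "count": 321, "id": 2},
    {"file": "foo", "project": "bar", "count": 321, "id": 2},
    {"file": "bar", "project": "foo", "count": 111, "id": 1},
    {"file": "bar", "project": "bar", "count": 111, "id": 1},
    {"file": "bar", "project": "foo", "count": 222, "id": 2},
    {"file": "bar", "project": "bar", "count": 222, "id": 2},
]


def naive_sum(rows):
    """Sum count per file over every row, duplicates included."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["file"]] += row["count"]
    return dict(totals)


def deduplicated_sum(rows):
    """Sum count per file, counting each (file, id) pair only once."""
    # Assumption: rows with the same (file, id) all hold the same count,
    # so keeping the last one seen loses nothing.
    one_per_pair = {}
    for row in rows:
        one_per_pair[(row["file"], row["id"])] = row["count"]

    totals = defaultdict(int)
    for (file_name, _id), count in one_per_pair.items():
        totals[file_name] += count
    return dict(totals)


print(naive_sum(docs))         # {'foo': 888, 'bar': 666}
print(deduplicated_sum(docs))  # {'foo': 444, 'bar': 333}
```

The second function is the behaviour wanted from the table: collapse the project dimension first, then sum.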

Is there a good way to achieve what I want? Thanks!
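One possible direction, sketched here rather than confirmed for this setup, is to let Elasticsearch collapse the duplicates itself: bucket by file, bucket again by id, take a single value of count per id bucket with a max aggregation, and add those values up with a sum_bucket pipeline aggregation. The snippet below only builds the request body as a Python dict. The aggregation names (per_file, per_id, count_once, sum_of_count) and the helper build_query are made up for illustration, and it assumes that file is mapped as a keyword field and that all duplicates of a given file and id share the same count.

```python
import json


def build_query(file_field="file", id_field="id", count_field="count"):
    """Build a search body that sums count once per (file, id) pair.

    Hypothetical helper; field names default to those used in the post.
    """
    return {
        "size": 0,  # only the aggregation results are needed, no hits
        "aggs": {
            "per_file": {
                "terms": {"field": file_field},
                "aggs": {
                    # One bucket per id within each file.
                    "per_id": {
                        "terms": {"field": id_field},
                        "aggs": {
                            # Duplicates share the same count (assumption),
                            # so max returns that single value.
                            "count_once": {"max": {"field": count_field}}
                        },
                    },
                    # Sibling pipeline aggregation: adds up count_once
                    # across all per_id buckets of this file.
                    "sum_of_count": {
                        "sum_bucket": {"buckets_path": "per_id>count_once"}
                    },
                },
            }
        },
    }


print(json.dumps(build_query(), indent=2))
```

Note that a terms aggregation returns only the top 10 buckets by default, so with many ids its size would need to be raised for the sum to be complete. Kibana's aggregation-based visualizations may offer a comparable "Sum bucket" option, but whether it fits a given table depends on the Kibana version and visualization type.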

system · March 1, 2023, 10:11am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
