Kibana - Aggregations over redundant documents

gferrette · May 2, 2019, 8:37pm

Hello!

I'm trying to denormalize a Business Intelligence database into Elasticsearch, but I'm having trouble building some visualizations in Kibana because of duplicate documents.

For example, for the set of documents below:

 "vm.nome_maquina" => "VM1",
  "dominio.descricao_dominio" => "ROOT",
           "vm.tamanho_disco" => "10",
               "ip.descricao" => "XXX.XXX.XXX.XXX",
 "datacenter.nome_datacenter" => "DC",
 "datacenter.tipo_capacidade" => "CPU"

 "vm.nome_maquina" => "VM1",
  "dominio.descricao_dominio" => "ROOT",
           "vm.tamanho_disco" => "10",
               "ip.descricao" => "XXX.XXX.XXX.XXX",
 "datacenter.nome_datacenter" => "DC",
 "datacenter.tipo_capacidade" => "CPU"
 
 "vm.nome_maquina" => "VM1",
  "dominio.descricao_dominio" => "ROOT",
           "vm.tamanho_disco" => "10",
               "ip.descricao" => "XXX.XXX.XXX.XXX",
 "datacenter.nome_datacenter" => "DC",
 "datacenter.tipo_capacidade" => "CPU"
 
 
 "vm.nome_maquina" => "VM2",
  "dominio.descricao_dominio" => "ROOT",
           "vm.tamanho_disco" => "10",
               "ip.descricao" => "XXX.XXX.XXX.XXX",
 "datacenter.nome_datacenter" => "DC",
 "datacenter.tipo_capacidade" => "CPU"
 
 "vm.nome_maquina" => "VM2",
  "dominio.descricao_dominio" => "ROOT",
           "vm.tamanho_disco" => "10",
               "ip.descricao" => "XXX.XXX.XXX.XXX",
 "datacenter.nome_datacenter" => "DC",
 "datacenter.tipo_capacidade" => "CPU"
 
 "vm.nome_maquina" => "VM2",
  "dominio.descricao_dominio" => "ROOT",
           "vm.tamanho_disco" => "10",
               "ip.descricao" => "XXX.XXX.XXX.XXX",
 "datacenter.nome_datacenter" => "DC",
 "datacenter.tipo_capacidade" => "CPU"

I'm trying to create a bar visualization that aggregates by terms on the field dominio.descricao_dominio and then sums the values of vm.tamanho_disco once per unique value of vm.nome_maquina.

So in the example above, I have 3 duplicated documents for each VM, each containing vm.nome_maquina, dominio.descricao_dominio and vm.tamanho_disco. The value I'm aiming for is "20", because that is the sum of vm.tamanho_disco for VM1 and VM2, counted once each, since both have the value "ROOT" in the field dominio.descricao_dominio.

Do you have any idea how to achieve that in Kibana?

Attached is the visualization that I tried to build, but I wasn't able to sum the values.

Thanks in advance!

[Attached image: ticket_elastic_2]
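The arithmetic described in this post can be sketched in plain Python, outside Kibana. This is only an illustration of the intended result, not a Kibana feature: the sample rows mirror the documents above, and the function name `disk_per_domain` is hypothetical. It also shows why a plain sum over every document overshoots.

```python
from collections import defaultdict

# Sample rows mirroring the documents above: 3 identical copies per VM.
docs = [
    {
        "vm.nome_maquina": vm,
        "dominio.descricao_dominio": "ROOT",
        "vm.tamanho_disco": "10",
    }
    for vm in ("VM1", "VM2")
    for _ in range(3)
]


def disk_per_domain(rows):
    """Sum vm.tamanho_disco per domain, counting each VM name only once."""
    seen = {}  # (domain, vm name) -> disk size
    for row in rows:
        key = (row["dominio.descricao_dominio"], row["vm.nome_maquina"])
        # Duplicates simply overwrite the entry with the same value.
        seen[key] = int(row["vm.tamanho_disco"])
    totals = defaultdict(int)
    for (domain, _vm), size in seen.items():
        totals[domain] += size
    return dict(totals)


# A plain sum counts every duplicate: 6 documents * 10.
naive_total = sum(int(row["vm.tamanho_disco"]) for row in docs)

print(naive_total)            # 60
print(disk_per_domain(docs))  # {'ROOT': 20}
```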

christophilus · May 3, 2019, 6:04pm

It looks like you've got the X axis configured correctly, but you need to use "Sum" instead of "Top Hit" on the Y axis.

gferrette · May 3, 2019, 6:35pm

Hello Chris!

Thanks for replying!

When I use "Sum" instead of "Top Hit", Kibana sums all the redundant values. For example, if I have 30 documents per VM, Kibana sums 10*30 for each vm.nome_maquina (300 for VM1 and 300 for VM2), as shown in the attached image.

Do I have to split the redundant values into separate indexes, like a relational database does?

Thanks again!

christophilus · May 3, 2019, 6:41pm

Ah! I missed your "unique values" constraint. In that case, "top hit" is correct, and your original is working correctly. Your total is 20 in the screenshot you gave. There's the 10 from VM1 and the 10 from VM2. It's just that the bar is split, and maybe you don't want the bar to be split? I'm not aware of a good way to display this data in a unified bar.
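One way the single, unsplit total could be expressed at the Elasticsearch level is with the `sum_bucket` pipeline aggregation: take one value per VM inside each domain bucket, then add those per-VM values together. The sketch below only builds the request body as a Python dict. The function name, the aggregation names (`per_domain`, `per_vm`, `disk`, `total_disk`) and the bucket size are illustrative assumptions. It also assumes that vm.tamanho_disco is mapped as a numeric field and that the two name fields are aggregatable keyword fields. Using `max` per VM stands in for Top Hit here, which is only equivalent when every duplicate of a VM carries the same disk size.

```python
import json


def build_unique_vm_sum_query(domain_field, vm_field, disk_field, size=100):
    """Build a search body that sums one disk value per VM within each domain."""
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "per_domain": {
                "terms": {"field": domain_field, "size": size},
                "aggs": {
                    # One bucket per VM; max collapses the duplicates to one value.
                    "per_vm": {
                        "terms": {"field": vm_field, "size": size},
                        "aggs": {"disk": {"max": {"field": disk_field}}},
                    },
                    # Sibling pipeline aggregation: adds up the per-VM values.
                    "total_disk": {
                        "sum_bucket": {"buckets_path": "per_vm>disk"}
                    },
                },
            }
        },
    }


query = build_unique_vm_sum_query(
    "dominio.descricao_dominio", "vm.nome_maquina", "vm.tamanho_disco"
)
print(json.dumps(query, indent=2))
```

Kibana's visualization editor also lists a Sum Bucket metric among its sibling pipeline aggregations, which may allow the same shape to be configured there. That has not been verified against this data.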

gferrette · May 3, 2019, 7:06pm

Hello Chris!

Thanks for the quick reply!

Yes, I want the bar not to be split. If there is no way to do that, I thought of moving the duplicated values (VM1 and VM2) into a separate index, so that the value "10" for VM1 is indexed only once there. After that, I was thinking of creating an index pattern to merge the two indexes. Since both indexes will have the field "vm.nome_maquina" with the same name and type, it should be possible to "join" them.

Is that a good approach?

Thanks again!

Gabriel.
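The "index each VM only once" idea can be sketched as follows, assuming the VM name is used as the document `_id` so that repeated rows collapse into a single document. The index name `vms` and the function `vm_actions` are hypothetical. The dicts use the `_index` / `_id` / `_source` action shape accepted by the Python client's bulk helper, but nothing is sent to a cluster here. Note that an index pattern matching two indexes searches them together and does not join documents on a shared field, so a Sum would have to run against the VM-level index alone.

```python
def vm_actions(rows, index_name="vms"):
    """Yield one bulk-style action per unique VM, keyed by its name."""
    latest = {}
    for row in rows:
        # Later duplicates replace earlier ones; they carry the same values.
        latest[row["vm.nome_maquina"]] = row
    for vm_name, row in latest.items():
        yield {
            "_index": index_name,
            "_id": vm_name,  # deterministic id: re-indexing overwrites, never duplicates
            "_source": {
                "vm.nome_maquina": vm_name,
                "dominio.descricao_dominio": row["dominio.descricao_dominio"],
                # Stored as a number so that a Sum metric can use it.
                "vm.tamanho_disco": int(row["vm.tamanho_disco"]),
            },
        }


rows = [
    {"vm.nome_maquina": vm, "dominio.descricao_dominio": "ROOT", "vm.tamanho_disco": "10"}
    for vm in ("VM1", "VM2")
    for _ in range(3)
]
actions = list(vm_actions(rows))
print(len(actions))  # 2
```

With one document per VM in that index, a plain Sum of vm.tamanho_disco split by dominio.descricao_dominio would give 20 for "ROOT" in this sample.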

christophilus · May 8, 2019, 2:33pm

I don't quite follow your latest comment, but I say, play around and see if you can get that to work!

system · June 5, 2019, 2:33pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
