Display all documents with duplicated value of a given field

jcaballero · May 8, 2019, 8:12pm

Hi

Kibana 5.4.1

Let's say I have this type of data in ElasticSearch:

name  <other fields>

aa   ...
bb   ...
cc   ...
bb   ...
dd   ...
ee   ...
ff   ...
aa   ...
gg   ...

and so on.
I would like to know if there is a way to use the query bar in Kibana to display only those docs whose "name" value is shared with at least one other doc. Or, to put it the opposite way, to filter out those docs whose "name" is unique.
The result would be like this:

name  <other fields>

aa   ...
bb   ...
bb   ...
aa   ...

as 'aa' and 'bb' are the only values that show up more than once.
Is this doable?

thanks a lot in advance.
Cheers,
Jose

thomasneirynck · May 8, 2019, 9:04pm

Hi @jcaballero,

The short answer is maybe. It depends on what your constraints are.

You can create a data table visualization with a terms aggregation and set the min_doc_count parameter to 2.

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_minimum_document_count_4

Then, run the terms aggregation on the name field.

To set the min_doc_count parameter, open the advanced settings of the aggregation and add it in the JSON input field.

e.g.: this will only return terms that have at least 100 matching documents
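
As a rough illustration of the setting described above, here is a minimal Python sketch that builds the equivalent Elasticsearch request body. The threshold of 2 (rather than 100), the aggregation name `duplicated_values`, the bucket `size` of 100, and the field name `name.keyword` are all assumptions for this example; use whatever non-analyzed field your mapping actually provides. The last lines print the fragment that would presumably go in the JSON input field of the terms aggregation.

```python
import json


def duplicate_terms_body(field, min_doc_count=2, size=100):
    """Build a search body whose terms aggregation only returns
    values that occur in at least `min_doc_count` documents."""
    return {
        "size": 0,  # no hits needed, only the aggregation buckets
        "aggs": {
            "duplicated_values": {  # hypothetical aggregation name
                "terms": {
                    "field": field,
                    "min_doc_count": min_doc_count,
                    "size": size,  # maximum number of buckets returned
                }
            }
        },
    }


# "name.keyword" is an assumed field name, not taken from the thread.
body = duplicate_terms_body("name.keyword")
print(json.dumps(body, indent=2))

# Fragment for the JSON input field of the terms aggregation in Kibana.
json_input = json.dumps({"min_doc_count": 2})
print(json_input)
```

Note that this yields one bucket per duplicated value with its document count, not the documents themselves.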

jcaballero · May 8, 2019, 9:11pm

Thanks @thomasneirynck for such a prompt response. I need to read the documentation carefully: I have never touched the advanced settings, and I don't want to break anything.

jcaballero · May 8, 2019, 10:07pm

Hmm. Maybe there is an easier way that does not require playing with the delicate "advanced settings"?
One of the other fields in my docs happens to be a counter. So the data actually looks like this:

name  counter   <other fields>

aa   1           ...
bb   1           ...
cc   1           ...
bb   2           ...
dd   1           ...
ee   1           ...
ff   1           ...
aa   2           ...
gg   1           ...

and, therefore, the result of the query I am looking for would be like this:

name  counter   <other fields>

aa   1           ...
bb   1           ...
bb   2           ...
aa   2           ...

Is it possible to somehow leverage the existence of that counter through the query bar?
Something equivalent to (WARNING: pseudo-code) this?

SELECT * WHERE name = ( SELECT name WHERE counter > 1 )

warkolm · May 8, 2019, 10:46pm

You should just be able to add a filter for counter > 1 then.

jcaballero · May 9, 2019, 12:10am

Would I then also see "aa 1" and "bb 1"?

warkolm · May 9, 2019, 12:14am

No, because their counter values are not greater than 1.

jcaballero · May 9, 2019, 12:24am

So it does not do what I need.

warkolm · May 9, 2019, 12:45am

But aren't aa 1 and aa 2 the same?

jcaballero · May 9, 2019, 12:54am

Nope. Sorry I was not clear. Note that in my examples, I included "other fields". They are different. And that is what I want to see. All fields for all documents with a value for "name" (any value) that appears more than once.
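
To make the requirement concrete, here is a minimal Python sketch of the two-pass logic on the sample data from this thread: first count how often each "name" occurs, then keep every document whose name occurs more than once. The helper names `docs_with_duplicated_name` and `terms_filter_body` are hypothetical, and the other fields are omitted. The second helper builds an Elasticsearch `terms` query for the duplicated names, which is one possible way to fetch the full documents in a second request once the names are known; it assumes the field is indexed as exact values.

```python
from collections import Counter

# Sample documents mirroring the example in this thread.
docs = [
    {"name": "aa", "counter": 1},
    {"name": "bb", "counter": 1},
    {"name": "cc", "counter": 1},
    {"name": "bb", "counter": 2},
    {"name": "dd", "counter": 1},
    {"name": "ee", "counter": 1},
    {"name": "ff", "counter": 1},
    {"name": "aa", "counter": 2},
    {"name": "gg", "counter": 1},
]


def docs_with_duplicated_name(documents, field="name"):
    """Return all documents whose `field` value appears more than once."""
    counts = Counter(d[field] for d in documents)  # pass 1: count values
    return [d for d in documents if counts[d[field]] > 1]  # pass 2: filter


def terms_filter_body(field, values):
    """Build a query body matching documents whose `field` is one of `values`."""
    return {"query": {"terms": {field: sorted(set(values))}}}


duplicates = docs_with_duplicated_name(docs)
for d in duplicates:
    print(d["name"], d["counter"])

second_request = terms_filter_body("name", [d["name"] for d in duplicates])
print(second_request)
```

A plain filter on counter > 1 would keep only the "aa 2" and "bb 2" rows, which is why a second pass over the names is needed to bring back "aa 1" and "bb 1" as well.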

system · June 6, 2019, 12:54am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
