Problem finding duplicate documents via Kibana? Issue with Unique Count!


I am a bit confused:

All our documents are shipped via Filebeat. Filebeat always adds the meta-information field offset (the position in the logfile) - at least that is my understanding.

If I filter for a single logfile, then the unique count of the offset field should be identical to the count, as long as I have no duplicates. Am I correct?

But Kibana shows different values. A lower number for Unique Count means that there must be duplicates - that is my understanding.
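For reference, Kibana's Unique Count metric is backed by Elasticsearch's cardinality aggregation, which is approximate by design. The equivalent raw query might look roughly like this (the index pattern and the logfile path are placeholders for our setup; `precision_threshold` is raised from its default of 3000 to rule out approximation error):

```json
POST filebeat-*/_search
{
  "size": 0,
  "query": {
    "term": { "source": "/var/log/myapp.log" }
  },
  "aggs": {
    "distinct_offsets": {
      "cardinality": {
        "field": "offset",
        "precision_threshold": 40000
      }
    }
  }
}
```

If `distinct_offsets.value` still differs from the hit count with the precision raised, the difference should be real duplicates rather than an artifact of the approximation.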

Now I want to find the documents where an offset appears more than once, so I would like to run a terms aggregation on the offset field and filter/sort by the number of occurrences:

When adding additional JSON options to the terms aggregation

  "min_doc_count" : 2

I get no results.
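For completeness, the full request I am trying looks roughly like this (index pattern and logfile path are placeholders; offset is a numeric field in the Filebeat 6.x mapping, so no `.keyword` suffix is needed):

```json
POST filebeat-*/_search
{
  "size": 0,
  "query": {
    "term": { "source": "/var/log/myapp.log" }
  },
  "aggs": {
    "duplicate_offsets": {
      "terms": {
        "field": "offset",
        "min_doc_count": 2,
        "size": 100,
        "order": { "_count": "desc" }
      }
    }
  }
}
```

One caveat I am aware of: the terms aggregation is computed per shard, so if the two copies of a duplicate land on different shards, `min_doc_count: 2` can hide the bucket at the shard level; setting `shard_min_doc_count` to 1 (or forcing a single shard) would rule that out.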

So the result is not what I expected. Can anyone shed some light on it?

Thanks, Andreas

Maybe the first entry from a given logfile has no offset? That's the only thing I can think of that would account for this.

But we have a difference of 19, not of 1.

I checked against the input logfile, and the count seems to be correct. So do we have an issue with the Unique Count aggregation in ES/Kibana?

The aggregation is done in ES, so if it is an agg issue it would be there. This is puzzling.

Moved to the Elasticsearch category.
Stack version is 6.2.3.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.