Problem finding duplicate documents via Kibana? Issue with unique count!


I am a bit confused:

All our documents are shipped via Filebeat. As far as I understand, Filebeat always adds the meta field offset (the position in the logfile).

If I filter for a single logfile, the unique count of the offset field should be identical to the count, as long as I have no duplicates. Am I correct?

But Kibana shows different values. As I understand it, a lower unique count means that there must be duplicates.
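
To make that reasoning concrete, here is a minimal Python sketch of the same check on a plain list of offsets, outside Elasticsearch. The sample offsets and the helper name `find_duplicate_offsets` are made up for illustration; they do not come from the real index.

```python
from collections import Counter


def find_duplicate_offsets(offsets):
    """Return {offset: occurrences} for every offset seen more than once."""
    counts = Counter(offsets)
    return {offset: n for offset, n in counts.items() if n > 1}


# Hypothetical offsets from one logfile; offset 120 was shipped twice.
offsets = [0, 120, 120, 245, 390]

total = len(offsets)        # what "count" reports
unique = len(set(offsets))  # what an exact "unique count" would report
duplicates = find_duplicate_offsets(offsets)

print(total, unique, duplicates)  # 5 4 {120: 2}
```

With an exact unique count, total minus unique equals the number of surplus copies, so a difference of 0 means no duplicates.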

Now I want to find the documents whose offset appears more than once, so I would like to run a terms aggregation on the offset field and filter / sort by the number of occurrences:

When adding additional JSON options to the terms aggregation

  "min_doc_count" : 2

I get no results.
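
For reference, the equivalent request sent directly to the `_search` endpoint might look roughly like the sketch below, written here as a Python dict. It assumes that the logfile path is stored in a keyword field named `source` (which I believe is the Filebeat 6.x default) and uses a made-up path; the function name and the aggregation name `duplicate_offsets` are arbitrary.

```python
import json


def duplicate_offsets_request(logfile, size=100):
    """Build a search body listing offsets that occur at least twice in one logfile."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        # Assumption: "source" holds the logfile path and is mapped as keyword.
        "query": {"term": {"source": logfile}},
        "aggs": {
            "duplicate_offsets": {
                "terms": {
                    "field": "offset",
                    "min_doc_count": 2,  # keep only offsets seen two or more times
                    "size": size,        # maximum number of buckets returned
                }
            }
        },
    }


# Hypothetical logfile path, for illustration only.
body = duplicate_offsets_request("/var/log/app/example.log")
print(json.dumps(body, indent=2))
```

If this returns no buckets either, that would suggest there really are no offsets with two or more documents in that logfile.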

So the result is not what I expected. Can anyone shed some light on it?

Thanks, Andreas

Maybe the first entry from a given logfile has no offset? That's the only thing I can think of that would account for this.

But we have a difference of 19, not of 1.

I checked against the input logfile, and the count seems to be correct. So do we have an issue with the unique count aggregation in ES / Kibana?

The aggregation is done in ES, so if it is an agg issue it would be there. This is puzzling.
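
One thing that may be worth ruling out: Kibana's unique count is backed by the Elasticsearch cardinality aggregation, which returns an approximate count once the number of distinct values exceeds its `precision_threshold` (default 3000). A small gap against the exact count can therefore appear without any duplicates. The sketch below builds a request with a raised threshold for comparison; the `source` field, the logfile path, and the names used are assumptions for illustration, not taken from the actual setup.

```python
import json

# Thresholds above 40000 have the same effect as 40000 in Elasticsearch.
MAX_PRECISION_THRESHOLD = 40000


def unique_offsets_request(logfile, precision_threshold=MAX_PRECISION_THRESHOLD):
    """Build a search body counting distinct offsets with a raised precision threshold."""
    threshold = min(precision_threshold, MAX_PRECISION_THRESHOLD)
    return {
        "size": 0,  # only the aggregation result is of interest
        # Assumption: "source" holds the logfile path and is mapped as keyword.
        "query": {"term": {"source": logfile}},
        "aggs": {
            "unique_offsets": {
                "cardinality": {
                    "field": "offset",
                    "precision_threshold": threshold,
                }
            }
        },
    }


# Hypothetical logfile path, for illustration only.
body = unique_offsets_request("/var/log/app/example.log")
print(json.dumps(body, indent=2))
```

If the unique count moves closer to the document count with the higher threshold, the difference is likely down to the approximation rather than to duplicate documents.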

Moved to Elasticsearch.
Stack version is 6.2.3.

