I count unique values for various log variables (Visualize->Metric->unique count). However, the number of unique values differs slightly depending on whether I use Kibana or export the same data as a .csv file and analyze it in another tool (R, Python).
Example:
Total hits in Kibana 3567 -> unique count of user_id: 3267
Total count in .csv-file/Python 3567 -> unique count of user_id: 3275
Hi @Mario_Lie! Sorry you didn't get a reply here sooner.
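Under the hood, Kibana uses Elasticsearch's cardinality aggregation to generate that unique count. That aggregation returns an approximate count rather than an exact one: since Elasticsearch is a distributed data store, computing a true cardinality is expensive, so precision has to be balanced against cluster load. That is why the Kibana number can differ slightly from the exact count you get from the .csv export.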
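If you want to check this yourself, here is a rough sketch in Python. It does two things: it computes the exact distinct count from CSV text (the reference value you already get outside Kibana), and it builds the request body for a cardinality aggregation with an explicit `precision_threshold`. That option is part of the cardinality aggregation; as far as I know it defaults to 3000 and is capped at 40000, and counts above the threshold are where the approximation becomes noticeable. The aggregation name `unique_users`, the helper function names, and the tiny sample data are made up for illustration; only `user_id` comes from your post. The sketch does not talk to a cluster, so you would still need to send the body to your index's `_search` endpoint yourself.

```python
import csv
import io
import json


def exact_unique_count(csv_text, column):
    """Exact distinct count of one column, used as the CSV-side reference."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return len({row[column] for row in reader})


def cardinality_query(field, precision_threshold=3000):
    """Build a search body with a cardinality aggregation on `field`.

    `unique_users` is a hypothetical aggregation name chosen for this sketch.
    """
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "unique_users": {
                "cardinality": {
                    "field": field,
                    "precision_threshold": precision_threshold,
                }
            }
        },
    }


# Made-up sample data standing in for the exported .csv file.
sample = "user_id\na\nb\na\nc\n"
print(exact_unique_count(sample, "user_id"))

# Request body with the threshold raised above the expected number of unique values.
print(json.dumps(cardinality_query("user_id", 40000), indent=2))
```

With roughly 3,270 unique values you are just above the default threshold, so raising `precision_threshold` should bring the Elasticsearch result closer to the exact count, at the cost of more memory per aggregation.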