Explicit Unique Count for Business Intelligence tasks: no chance?

Mikhail · February 6, 2018, 12:30pm

I come from the SQL world and have fallen into the Unique Count trap called cardinality.
I am very disappointed that Unique Count here is not the same as COUNT(DISTINCT) in SQL.
https://www.elastic.co/guide/en/elasticsearch/guide/current/cardinality.html

Is there any way to get an explicit (exact) unique count in Kibana? Approximations are not acceptable for my task.

Some explanation: I built a SQL data mart with a large number of fields, i.e. I joined many SQL entities to get all the desired data and all labels into one data mart. As expected, this multiplied the rows, because the ID of an entity can appear in several rows of the mart. I then transferred the data into a single ES index via Logstash. Next came a Unique Count on the ClientID field in Kibana, a comparison with count(distinct ClientID) executed on the SQL mart, and, as a result, a bad mood for at least the rest of the day.
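To make the row multiplication concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns other than ClientID, and all rows are invented for illustration and are not the actual mart. It only shows that after a join the row count and count(distinct ClientID) diverge, which is why the Kibana figure has to be compared against the distinct count rather than the row count.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (ClientID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders  (OrderID INTEGER PRIMARY KEY, ClientID INTEGER);
    INSERT INTO clients VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO orders  VALUES (10, 1), (11, 1), (12, 2), (13, 3), (14, 3);
""")

# The join repeats each ClientID once per matching order,
# so the flattened "mart" has more rows than there are clients.
row_count, distinct_clients = conn.execute("""
    SELECT COUNT(*), COUNT(DISTINCT c.ClientID)
    FROM clients AS c
    JOIN orders AS o ON o.ClientID = c.ClientID
""").fetchone()

print(row_count, distinct_clients)  # prints: 5 3
```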

thomasneirynck · February 7, 2018, 9:52pm

hi @Mikhail,

imho there's no quick shortcut for this. This limitation stems from Elasticsearch's distributed architecture and the choice to use the HyperLogLog algorithm to compute the unique count.

I would open an enhancement request in the Elasticsearch repo for this: https://github.com/elastic/elasticsearch/issues/new
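A toy sketch of why the distributed architecture matters, using plain Python sets as stand-in shards. This is not how Elasticsearch is implemented, and the shard contents are made up. The point is that per-shard distinct counts cannot simply be added up: an exact answer needs the full set of values from every shard merged in one place, and that set grows with the cardinality. Approximate algorithms such as HyperLogLog avoid this by merging small fixed-size summaries instead.

```python
# Hypothetical shards, each holding the ClientID values it has seen.
shards = [
    {101, 102, 103},
    {102, 103, 104},
    {101, 104, 105},
]

# Adding per-shard distinct counts counts shared IDs more than once.
sum_of_shard_counts = sum(len(s) for s in shards)

# An exact answer requires merging the complete value sets in one place.
exact_distinct = len(set().union(*shards))

print(sum_of_shard_counts, exact_distinct)  # prints: 9 5
```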

Mikhail · February 9, 2018, 9:48am

hi @thomasneirynck,

Thank you for your support on this. I have created a feature request there:

I hope it will be implemented.

Mikhail · February 16, 2018, 4:52pm

So, the request for this feature has been closed on GitHub. It will not be implemented.
This means that if you want to build a BI system using ELK and that system requires explicit counts, then IMHO you have three options:

1. Use the existing Unique Count metric and accept the approximation.
2. Organize your source data so that the same EntityID is never repeated within the source data mart where you want to count unique values. But in that case, what about relations between indices (joins) at the Kibana level? I don't know how to achieve this. If someone knows, please suggest.
3. Do not use ELK for your BI system.

I will go with option 1, but will not use Kibana for counts where an explicit unique count is required.
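For option 1, the approximation can be tightened, though not removed. The cardinality aggregation accepts a precision_threshold parameter (maximum supported value 40000), and counts below the threshold are expected to be close to accurate at the cost of more memory. Below is a minimal sketch of such a request body as a Python dict. The helper name unique_count_request and the aggregation name unique_count are hypothetical; only "cardinality", "field" and "precision_threshold" are Elasticsearch names. The result is still an approximation, not a guaranteed exact count.

```python
def unique_count_request(field, precision_threshold=40000):
    """Build a search body with a cardinality aggregation on `field`.

    Hypothetical helper for illustration; 40000 is the maximum
    precision_threshold that Elasticsearch supports.
    """
    return {
        "size": 0,  # return only the aggregation result, no hits
        "aggs": {
            "unique_count": {  # hypothetical aggregation name
                "cardinality": {
                    "field": field,
                    "precision_threshold": precision_threshold,
                }
            }
        },
    }


body = unique_count_request("ClientID")
```

The resulting dict can be sent as the body of a search request against the index.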

system · March 16, 2018, 4:52pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
