Queries regarding precision_threshold setting in cardinality aggregation


(Shailesh Vaidya) #1

Hello,

We are using the cardinality aggregation in our project. We have around 1,400,000 documents, and each document contains a field named url. I want to get the document count for each unique url, so I am using a terms aggregation along with cardinality. However, I observed that the count is not correct. According to the ELK documentation, precision_threshold can be set to at most 40000. Does that mean that if the unique count is more than 40000, the query results will be inaccurate? Could you please confirm?


(Mark Harwood) #2

Yes. But do you need the cardinality agg in this case? If you want to count hits on urls (assuming docs are web access logs), a terms agg will suffice. Only if you’re doing something like counting unique users per url would you need to also use the cardinality agg on something like the sessionid field.
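
To make the distinction concrete, here is a rough sketch of the two request bodies, written as Python dicts that could be sent to the search endpoint. The field names url and sessionid are taken from this thread; the aggregation names (by_url, unique_sessions) and the helper functions are made up for illustration, and the sketch assumes both fields are mapped so that they can be aggregated on (for example as keyword).

```python
import json


def hits_per_url(url_field="url", size=10):
    """Terms agg only: doc_count of each bucket is the number of hits per url."""
    return {
        "size": 0,  # skip the hits themselves, only aggregations are needed
        "aggs": {
            "by_url": {"terms": {"field": url_field, "size": size}},
        },
    }


def unique_sessions_per_url(url_field="url", unique_field="sessionid",
                            precision_threshold=40000, size=10):
    """Terms agg with a cardinality sub-agg: approximate unique values per url."""
    return {
        "size": 0,
        "aggs": {
            "by_url": {
                "terms": {"field": url_field, "size": size},
                "aggs": {
                    "unique_sessions": {
                        "cardinality": {
                            "field": unique_field,
                            # 40000 is the documented maximum; counts above
                            # the threshold are approximate
                            "precision_threshold": precision_threshold,
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(hits_per_url(), indent=2))
    print(json.dumps(unique_sessions_per_url(), indent=2))
```

The first body is enough when only the per-url document count is needed; the second is only worth its approximation cost when a different field has to be de-duplicated inside each url bucket.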


(Shailesh Vaidya) #3

Hi Mark,

Thanks for the reply.

The documents consist of Jenkins build details, such as URL, build number, pipeline-id, repo url, etc.
Basically, a document is created for each Jenkins build. With each build we also store the department name. Now, using an Elastic query, we want to extract the unique pipeline-urls per department.

As per your suggestion, we can also use cardinality on the pipeline-id field. However, I guess the issue will still remain the same. Is there any better way to achieve this?
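
One direction that might avoid the approximation, sketched here under assumptions rather than confirmed in this thread: page through a composite aggregation over (department, pipeline url) pairs and count the buckets per department on the client. Each composite bucket is one distinct combination of its source values, so the count is exact. The field names department and pipeline_url, the aggregation name pairs, and both helper functions are hypothetical; fetching the pages from the cluster is left out.

```python
from collections import Counter


def composite_body(after_key=None, page_size=1000,
                   dept_field="department", url_field="pipeline_url"):
    """Build one page request for distinct (department, pipeline_url) pairs."""
    composite = {
        "size": page_size,
        "sources": [
            {"department": {"terms": {"field": dept_field}}},
            {"pipeline_url": {"terms": {"field": url_field}}},
        ],
    }
    if after_key is not None:
        # resume after the last bucket of the previous page
        composite["after"] = after_key
    return {"size": 0, "aggs": {"pairs": {"composite": composite}}}


def count_unique_urls(pages):
    """Count distinct pipeline urls per department.

    `pages` is an iterable of the `pairs` aggregation results, each holding
    a `buckets` list whose entries have a `key` dict with `department` and
    `pipeline_url`.
    """
    counts = Counter()
    for page in pages:
        for bucket in page["buckets"]:
            counts[bucket["key"]["department"]] += 1
    return dict(counts)
```

The caller would repeat the request, passing the after_key value returned with each page, until a page comes back with no buckets.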

