With my data set I have seen a mismatch between the counts in ELK and in my DB. I have been using the cardinality aggregation to count the unique ids of a field, but ran into an issue: the cardinality aggregation's precision_threshold defaults to 3,000, and any count above that threshold is only approximate, which is not acceptable for what I intend to do with my data.
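For reference, the aggregation I am running looks roughly like this (`my_field` is a placeholder for the actual field name):

```json
{
  "size": 0,
  "aggs": {
    "unique_ids": {
      "cardinality": {
        "field": "my_field",
        "precision_threshold": 3000
      }
    }
  }
}
```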
I recognize that precision_threshold can be raised to an upper limit of 40,000, but that is far too low for my data set (1 mil+). Using the fingerprint filter in Logstash, I am able to create a new field set to either 1 or 0 and count it with the sum aggregation instead. However, my problem comes from additional filters I need to sum using the same approach (sum instead of cardinality).
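In case it helps, the workaround query is roughly the following, assuming the 1/0 flag field produced in the fingerprint step is called `is_first_occurrence` (the name is a placeholder):

```json
{
  "size": 0,
  "aggs": {
    "unique_count": {
      "sum": {
        "field": "is_first_occurrence"
      }
    }
  }
}
```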
My intent is to use a ruby filter to build arrays of specific fields that all fall under a given document id. I would appreciate some insight on this if anyone has tried a similar approach.
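To make the idea concrete, here is a minimal sketch in plain Ruby of the grouping I have in mind; the field names (`doc_id`, `status`, `region`) and the sample events are made up for illustration, and the real logic would live inside a Logstash ruby filter rather than a standalone script:

```ruby
# Hypothetical sample events, each a flat hash like a Logstash event.
events = [
  { "doc_id" => "a", "status" => "open",   "region" => "us" },
  { "doc_id" => "a", "status" => "closed", "region" => "eu" },
  { "doc_id" => "b", "status" => "open",   "region" => "us" }
]

# For each document id, collect the values of the fields of interest
# into arrays, so every id ends up with one array per field.
grouped = Hash.new { |h, k| h[k] = { "status" => [], "region" => [] } }

events.each do |event|
  id = event["doc_id"]
  grouped[id]["status"] << event["status"]
  grouped[id]["region"] << event["region"]
end
```

After this runs, `grouped["a"]` holds `{"status" => ["open", "closed"], "region" => ["us", "eu"]}`, i.e. all field values belonging to document id "a" gathered into arrays.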