Just had a Visualisation/Functionality question regarding Kibana's "Show Terms" functionality.
I am visualising the top 5 Customers (Text, Keyword) by Transaction Amount Total (Sum, Number).
When I limit the number of customers displayed to 5 (ordered by descending sum of Transaction Amount), I get:
Customer A (with 100k),
Customer B (with 90k),
Customer C (with 80k),
Customer D (with 70k),
Customer E (with 60k)
(Aliased for privacy reasons).
However, when I expand this to the top 1000 to double-check my working, the first 5 entries are no longer A, B, C, D, E, and are instead:
Customer A (100k),
Customer J (95k),
Customer B (90k),
Customer M (88k),
Customer Z (85k)
Any idea why I would be getting incorrect information when I reduce the number of terms shown? I am using the Customer_Name.Keyword field.
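If the field has quite high cardinality, I believe this is expected, as terms aggregations are approximate: each shard returns only its own top terms, so a customer whose total is spread evenly across shards can be left out when the size is small. How many customers do you have in the index? How many shards is this data distributed across?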
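To make the approximation concrete, here is a small simulation of a per-shard top-N merge. It is only a sketch with made-up data, not Elasticsearch's actual implementation: the function names, the filler customers and the amounts are all hypothetical, and the `shard_size` argument simply mirrors the idea of how many terms each shard hands back. Customer J is spread evenly over five shards, so it never makes any single shard's shortlist.

```python
from collections import Counter

def exact_top_terms(shards, size):
    """Sum every customer's amount across all shards, then take the top `size`."""
    totals = Counter()
    for shard in shards:
        totals.update(shard)  # adds the per-shard sums together
    return totals.most_common(size)

def sharded_top_terms(shards, size, shard_size):
    """Each shard reports only its own top `shard_size` customers; the
    partial results are merged and the top `size` are kept."""
    merged = Counter()
    for shard in shards:
        local_top = Counter(shard).most_common(shard_size)
        merged.update(dict(local_top))
    return merged.most_common(size)

# Hypothetical data (amounts in thousands): each big customer lives on one
# shard, while J has 19 on every shard (95 in total).
big = {"A": 100, "B": 90, "C": 80, "D": 70, "E": 60}
shards = []
for s, (name, amount) in enumerate(big.items()):
    shard = {name: amount, "J": 19}
    # Three filler customers per shard that outrank J locally.
    shard.update({f"filler-{s}-{i}": v for i, v in enumerate((25, 22, 21))})
    shards.append(shard)

print("exact:       ", exact_top_terms(shards, 5))
print("shard_size=3:", sharded_top_terms(shards, 5, 3))
print("shard_size=5:", sharded_top_terms(shards, 5, 5))
```

With the small per-shard cutoff, J is missing from the merged top 5 even though its true total is the second highest. Once every shard returns enough terms, the merged result matches the exact one, which is consistent with the correct ordering showing up when the size is raised to 1000.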
If all the data is in a single index with a single shard it sounds strange that it changes. Do the shards have the same number of documents if you look at the _cat/shards API?
Sorry, I stand corrected. It somehow ended up on 5 primary shards:
get _cat/shards
indx 2 p STARTED 20110 6.1mb 172.30.60.11 jpK1TfI
indx 2 r UNASSIGNED
indx 3 p STARTED 19863 5.7mb 172.30.60.11 jpK1TfI
indx 3 r UNASSIGNED
indx 1 p STARTED 20126 6.1mb 172.30.60.11 jpK1TfI
indx 1 r UNASSIGNED
indx 4 p STARTED 19770 5.7mb 172.30.60.11 jpK1TfI
indx 4 r UNASSIGNED
indx 0 p STARTED 20131 5.7mb 172.30.60.11 jpK1TfI
indx 0 r UNASSIGNED
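For a quick check of how evenly the documents are spread, the listing above can be parsed with a few lines of Python. This is a rough sketch that assumes the default `_cat/shards` column order (index, shard, prirep, state, docs, store, ip, node); the helper name `primary_doc_counts` is made up for illustration.

```python
CAT_SHARDS = """\
indx 2 p STARTED 20110 6.1mb 172.30.60.11 jpK1TfI
indx 2 r UNASSIGNED
indx 3 p STARTED 19863 5.7mb 172.30.60.11 jpK1TfI
indx 3 r UNASSIGNED
indx 1 p STARTED 20126 6.1mb 172.30.60.11 jpK1TfI
indx 1 r UNASSIGNED
indx 4 p STARTED 19770 5.7mb 172.30.60.11 jpK1TfI
indx 4 r UNASSIGNED
indx 0 p STARTED 20131 5.7mb 172.30.60.11 jpK1TfI
indx 0 r UNASSIGNED
"""

def primary_doc_counts(text):
    """Return {shard number: doc count} for started primary shards."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Unassigned replicas have no docs column, so they are skipped here.
        if len(fields) >= 5 and fields[2] == "p" and fields[3] == "STARTED":
            counts[int(fields[1])] = int(fields[4])
    return counts

counts = primary_doc_counts(CAT_SHARDS)
total = sum(counts.values())
spread = max(counts.values()) - min(counts.values())
print(f"{len(counts)} primaries, {total} docs, max-min spread {spread}")
```

The five primaries hold roughly 20,000 documents each, so the data is evenly distributed and each shard sees only about a fifth of any widely spread customer's transactions.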