Top3 in Line graph and Top values in Available fields are different

its-ogawa · July 22, 2022, 10:09am

I'm working on visualizing the statistics of logs collected by Kibana.
I would like to know how certain items in the logs have changed over time.

For example, I would like to aggregate the IP addresses of remote hosts from the access logs and see how the accesses from the most frequent addresses have changed.

I created a panel in a Kibana Dashboard with a Line graph of the Top 3, but I am not sure they are really the Top 3, because they differ from the Top 3 values displayed when I select the same item in the Available fields list.
Why is this?
Are these aggregation rules different?

flash1293 · July 22, 2022, 10:26am

Top values are approximate - especially for high cardinality fields. In 8.3 we introduced an “accuracy mode” in Lens (see: Kibana highlights | Elastic Installation and Upgrade Guide [8.3] | Elastic).
It raises the shard_size parameter on the Elasticsearch side - it's heavier on the cluster resources but yields more accurate results.

More information about this: Terms aggregation | Elasticsearch Guide [8.3] | Elastic
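
To make the shard_size parameter concrete, here is a rough sketch of a terms aggregation request body written as a Python dict. The field name `remote_ip`, the aggregation name `top_remote_hosts`, and the value 100 are assumptions for illustration only; this is not the exact request Lens sends, and the value Lens picks in accuracy mode may differ.

```python
import json

# Hypothetical request body for a "top 3 remote hosts" terms aggregation.
# "size" and "shard_size" are real terms-aggregation parameters;
# the field name, aggregation name, and the value 100 are made up for this sketch.
request_body = {
    "size": 0,  # no search hits needed, only the aggregation result
    "aggs": {
        "top_remote_hosts": {
            "terms": {
                "field": "remote_ip",  # assumed field name
                "size": 3,             # number of terms returned to the client
                "shard_size": 100,     # number of candidate terms each shard returns
            }
        }
    },
}

print(json.dumps(request_body, indent=2))
```

The larger shard_size is relative to size, the more candidates each shard hands to the coordinating node, which costs more but makes a wrong top list less likely.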

its-ogawa · July 22, 2022, 10:30am

Thanks for answering.

It doesn't have to be so exact, but I am puzzled by these different values.

Does this mean that the methods of calculation are different from each other?

Even if Lens' Top 3 is also an estimate, is it the Top 3 calculated over that time period?

flash1293 · July 22, 2022, 12:45pm

It is calculated over the time period, but separately per node in the cluster (actually per shard of the queried index pattern).

More thorough explanation: Elasticsearch splits up data across multiple indices and shards sitting on different nodes - every shard does its own processing, then sends the result to the coordinating node (the one Kibana talks to), which merges the individual shards' results and sends the response to Kibana. However, ordering a list of terms can't be distributed across multiple nodes.

One option would be for each node to send the full list of local terms to the coordinating node which merges and orders all of these lists, then sends the top 3 to the client. However, if there are millions of terms in total, this would be super expensive as a lot of data would have to be transferred to the coordinating node (and it would also use up a lot of memory on the coordinating node).

So Elasticsearch doesn't do this; instead it just sends the top 15 terms or so per shard to the coordinating node. This keeps the memory usage and network traffic low, but it means the list can be wrong.
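
For reference, the Terms aggregation documentation gives the default shard_size as size * 1.5 + 10, which is where a figure of roughly 15 for a top 3 comes from. A small sketch of that arithmetic follows; the function name `default_shard_size` is hypothetical, and truncating to an integer is an assumption about how the fractional part is handled.

```python
def default_shard_size(size: int) -> int:
    """Documented default for terms aggregations: size * 1.5 + 10.

    Truncation to an integer is an assumption made for this sketch.
    """
    return int(size * 1.5 + 10)


for size in (3, 5, 10):
    print(f"size={size} -> shard_size={default_shard_size(size)}")
```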

Consider the following example:

node one has the following data:

term	count
A	95
B	94
C	93
X	1
Y	2
Z	3

node two has the following data:

term	count
A	9
B	8
C	7
X	98
Y	97
Z	96

Both nodes send their top 3 (A-C for node one and X-Z for node two) to the coordinating node, which merges and sorts the partial lists and sends the top 3 of that list to the client:

combined top 3 lists from both nodes

term	count
X	98
Y	97
Z	96
A	95
B	94
C	93

User sees:

term	count
X	98
Y	97
Z	96

However, if the data nodes had each sent their top 6 lists, the merged counts would have been very different:

term	count
A	95 + 9 = 104
B	94 + 8 = 102
C	93 + 7 = 100
X	1 + 98 = 99
Y	2 + 97 = 99
Z	3 + 96 = 99

So the client would eventually see

term	count
A	104
B	102
C	100

This is what the "accuracy mode" is about - it increases the number of terms transferred from the data nodes to the coordinating node, which is more expensive to calculate, but there is a smaller chance of returning the wrong top values.
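
The example above can be reproduced with a short simulation. This is a simplified model of the merge step, not Elasticsearch's actual implementation; the function names `shard_top` and `merged_top` are made up for this sketch, and the data is the two-node example from this post.

```python
from collections import Counter


def shard_top(counts, shard_size):
    """Return the shard_size most frequent terms of one shard as a dict."""
    return dict(Counter(counts).most_common(shard_size))


def merged_top(shards, size, shard_size):
    """Merge the per-shard partial lists and return the overall top `size` terms."""
    merged = Counter()
    for counts in shards:
        # Only the terms each shard chose to send are added up here.
        merged.update(shard_top(counts, shard_size))
    return merged.most_common(size)


shard_one = {"A": 95, "B": 94, "C": 93, "X": 1, "Y": 2, "Z": 3}
shard_two = {"A": 9, "B": 8, "C": 7, "X": 98, "Y": 97, "Z": 96}

# Each shard sends only its local top 3: the merged result is misleading.
approx = merged_top([shard_one, shard_two], size=3, shard_size=3)
print(approx)  # [('X', 98), ('Y', 97), ('Z', 96)]

# Each shard sends its top 6 (all terms here): the merged result is correct.
exact = merged_top([shard_one, shard_two], size=3, shard_size=6)
print(exact)  # [('A', 104), ('B', 102), ('C', 100)]
```

Note that with shard_size=3 not only the ranking but also the counts are off: X is reported as 98 although its true total is 99, because the shard where X is rare never sent it.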

its-ogawa · July 22, 2022, 1:20pm

What a perfect answer! I'm so impressed!

Ah. So, in the case of multi-node, the above is the case?
Well. It's just an approximation.
I also found out that we can choose "accuracy mode" for exact calculation.
So we have a choice.

Now, I have another question.
I am using a single node cluster.
In this case, the above case will not happen, right?

flash1293 · July 22, 2022, 2:50pm

It will still happen if you have more than one index in your pattern or more than one shard in your index.

I glossed over this in my example above to simplify, but the logic I described happens per shard, even if the shards are on the same node.

its-ogawa · July 25, 2022, 1:10am

Hmmm... So it occurs in every shard.

But I collect logs on the same shard on a daily basis, and in many cases the rankings seem to be different even on a single shard.

system · August 22, 2022, 1:11am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
