Top3 in Line graph and Top values in Available fields are different

I'm working on visualizing the statistics of logs collected by Kibana.
I would like to know how certain items in the logs have changed over time.

For example, I would like to aggregate the IP addresses of remote hosts from the access logs and see how the accesses from the most frequent addresses have changed.

I created a panel in a Kibana dashboard with a line graph of the Top 3, but I am not sure they are really the Top 3, because they differ from the top values displayed when I select the same field in the Available fields list.
Why is this?
Are these aggregation rules different?

Top values are approximate, especially for high-cardinality fields. In 8.3 we introduced an “accuracy mode” in Lens (see "Kibana highlights" in the Elastic Installation and Upgrade Guide [8.3]).
It sets the shard_size parameter on the Elasticsearch side. This is heavier on cluster resources but yields more accurate results.

More information about this: "Terms aggregation" in the Elasticsearch Guide [8.3].
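For illustration, here is a minimal sketch of a terms aggregation request body with shard_size set explicitly, built as a plain Python dictionary. The field name "remote_host.keyword", the aggregation name "top_terms", and the value 1000 are assumptions made up for this example; the value Lens actually sends in accuracy mode is not stated in this thread.

```python
import json


def top_terms_request(field, size, shard_size=None):
    """Build a search body asking for the top `size` terms of `field`.

    `shard_size` is the number of candidate terms each shard returns to the
    coordinating node. When it is None the parameter is omitted and
    Elasticsearch applies its own default.
    """
    terms = {"field": field, "size": size}
    if shard_size is not None:
        terms["shard_size"] = shard_size
    # "size": 0 suppresses the document hits; only the aggregation is returned.
    return {"size": 0, "aggs": {"top_terms": {"terms": terms}}}


# Hypothetical field name and shard_size, chosen only for illustration.
body = top_terms_request("remote_host.keyword", size=3, shard_size=1000)
print(json.dumps(body, indent=2))
```

A larger shard_size makes each shard return more candidates, which costs more memory and network traffic but lowers the chance of missing a term that is frequent overall.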

Thanks for answering.

It doesn't have to be so exact, but I am puzzled by these different values.

Does this mean that the methods of calculation are different from each other?

Even if Lens' Top 3 is also an estimate, is it the Top 3 calculated over that time period?

It is calculated over the time period, but separately per node in the cluster (actually per shard of the queried index pattern).

More thorough explanation: Elasticsearch splits up data across multiple indices and shards sitting on different nodes. Every shard does its own processing, then sends the result to the coordinating node (the one Kibana talks to), which merges the individual results and sends the response to Kibana. However, ordering a list of terms can't be distributed across multiple nodes.

One option would be for each node to send the full list of local terms to the coordinating node which merges and orders all of these lists, then sends the top 3 to the client. However, if there are millions of terms in total, this would be super expensive as a lot of data would have to be transferred to the coordinating node (and it would also use up a lot of memory on the coordinating node).

So Elasticsearch doesn't do this. Instead, each shard sends only its top 15 terms or so to the coordinating node. This keeps the memory usage and network traffic low, but it means the final list can be wrong.

Consider the following example:

node one has the following data:

term count
A 95
B 94
C 93
X 1
Y 2
Z 3

node two has the following data:

term count
A 9
B 8
C 7
X 98
Y 97
Z 96

Both nodes send their top 3 (A-C for node one and X-Z for node two) to the coordinating node, which merges and sorts the partial lists and sends the top 3 of that list to the client:

combined top 3 lists from both nodes

term count
X 98
Y 97
Z 96
A 95
B 94
C 93

User sees:

term count
X 98
Y 97
Z 96

However, if each data node had sent its top 6 list, the outcome would have been very different:

term count
A 95 + 9 = 104
B 94 + 8 = 102
C 93 + 7 = 100
X 1 + 98 = 99
Y 2 + 97 = 99
Z 3 + 96 = 99

So the client would eventually see:

term count
A 104
B 102
C 100

This is what the "accuracy mode" is about: it increases the number of terms transferred from the data nodes to the coordinating node, which is more expensive to calculate, but there is a smaller chance of returning the wrong top values.
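The two-node example above can be replayed as a small simulation. This is only a sketch of the merge logic as described in this answer, not Elasticsearch's actual implementation; the function name merged_top_terms and its parameters are made up for illustration.

```python
from collections import Counter


def merged_top_terms(shards, size, shard_size):
    """Merge each shard's local top `shard_size` terms, return the top `size`."""
    merged = Counter()
    for shard in shards:
        # Each shard reports only its own most frequent terms.
        for term, count in Counter(shard).most_common(shard_size):
            merged[term] += count
    # The coordinating node only sees what the shards reported.
    return merged.most_common(size)


# Term counts from the example above.
node_one = {"A": 95, "B": 94, "C": 93, "X": 1, "Y": 2, "Z": 3}
node_two = {"A": 9, "B": 8, "C": 7, "X": 98, "Y": 97, "Z": 96}

# Each node sends only its top 3: the merged result is misleading.
print(merged_top_terms([node_one, node_two], size=3, shard_size=3))
# → [('X', 98), ('Y', 97), ('Z', 96)]

# Each node sends its top 6: the merged result is the true top 3.
print(merged_top_terms([node_one, node_two], size=3, shard_size=6))
# → [('A', 104), ('B', 102), ('C', 100)]
```

With shard_size=3 the counts for A, B and C from node two never reach the coordinating node, so their totals are understated and X, Y and Z win.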

What a perfect answer! I'm so impressed!

Ah, so that is what happens in a multi-node cluster.
So the result is just an approximation.
I also found out that we can choose "accuracy mode" for a more accurate calculation.
So we have a choice.

Now, I have another question.
I am using a single node cluster.
In this case, the above case will not happen, right?

It will still happen if you have more than one index in your pattern or more than one shard in your index.

I glossed over this in my example above to simplify, but the logic I described happens per shard, even if the shards are on the same node.

Hmmm... So it occurs in every shard.

But I collect logs into the same shard on a daily basis, and in many cases the rankings seem to differ even on a single shard.