Help understanding "order by" metric


(Richie) #1

Hi there, I've been tearing my hair out in an attempt to get a heatmap to behave exactly as I'd like.

My goal is to display the max timestamp for documents, grouped by a specific term. This appears to work well when I order the Y-axis by the term itself (i.e. alphabetically):

But I really want to order the Y-axis by the metric itself, so that groups with a lower max timestamp appear at the top of the heatmap. However, when I change the ordering to use the metric, it appears to change the actual values that I'm seeing for any given group (see the highlighted row above and below: same bucket, different values), which is not what I'd expect from an "ordering function":

I'm sure this is a fundamental misunderstanding on my part but would really like to get to the bottom of it. Any help in explaining what I'm seeing would be greatly appreciated.

Thanks


(Lee Drengenberg) #2

Hi Richie,

Don't feel bad. Some of this behavior can be pretty confusing. I think the difference you're seeing in values is caused by some optimizations Elasticsearch does. If I understand it correctly, if the data in your gc-* indices is split over multiple shards (which is the normal case), then the top 20 in your sort (either by Term or by Max timestamp) can be inaccurate, because each shard returns only its own top buckets and those partial results are then combined.
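To make that concrete, here is a small Python sketch of the effect. It is not Elasticsearch code, just a toy model that assumes each shard returns only its own top buckets (ordered by ascending max) and that the coordinating node merges whatever it receives. The shard contents, term names, and function names are all made up for illustration.

```python
def shard_top_buckets(docs, size):
    """Toy model of one shard: compute the max per term locally,
    then return only the first `size` terms ordered by ascending max."""
    local_max = {}
    for term, ts in docs:
        local_max[term] = max(ts, local_max.get(term, ts))
    ordered = sorted(local_max.items(), key=lambda kv: (kv[1], kv[0]))
    return dict(ordered[:size])


def merge(shard_results, size):
    """Toy model of the coordinating node: combine the buckets the shards
    returned, take the max of the values it was given, and re-sort."""
    merged = {}
    for result in shard_results:
        for term, ts in result.items():
            merged[term] = max(ts, merged.get(term, ts))
    ordered = sorted(merged.items(), key=lambda kv: (kv[1], kv[0]))
    return ordered[:size]


# Hypothetical data: (term, timestamp) pairs spread over two shards.
shard_a = [("host-a", 10), ("host-b", 50), ("host-c", 60)]
shard_b = [("host-a", 90), ("host-b", 20), ("host-c", 30)]

# Each shard returns only 1 bucket, so the merge sees partial data.
approx = merge([shard_top_buckets(s, 1) for s in (shard_a, shard_b)], 2)

# Each shard returns every bucket, so the merge is exact.
exact = merge([shard_top_buckets(s, 3) for s in (shard_a, shard_b)], 2)

print(approx)  # [('host-a', 10), ('host-b', 20)]
print(exact)   # [('host-b', 50), ('host-c', 60)]
```

In this toy model host-b shows a max of 20 in the truncated result but 50 in the exact one: same bucket, different value, because the shard holding its larger timestamp never sent that bucket. If this is what is happening in your case, it would also explain why ordering by the term looks fine: every shard then tends to return the same terms, so nothing is missing when the results are merged.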

Read through the sort example here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
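If you want to experiment with this outside Kibana, a request body along the following lines might be a starting point. This is only a sketch: the field names `host.keyword` and `@timestamp` and the aggregation names `by_term` and `max_ts` are placeholders I've assumed, not taken from your setup. `size`, `shard_size`, and `order` are parameters of the terms aggregation described on that page; asking each shard for more buckets via `shard_size` should reduce the error, though it may not remove it entirely.

```python
import json


def build_terms_request(term_field, metric_field, size=20, shard_size=None):
    """Build a terms aggregation ordered by an ascending max sub-aggregation.

    `term_field` and `metric_field` are placeholders for your own fields.
    """
    terms = {
        "field": term_field,
        "size": size,
        # Order buckets by the "max_ts" sub-aggregation defined below.
        "order": {"max_ts": "asc"},
    }
    if shard_size is not None:
        # Ask every shard for more candidate buckets than the final size.
        terms["shard_size"] = shard_size
    return {
        "size": 0,  # no search hits needed, only the aggregation
        "aggs": {
            "by_term": {
                "terms": terms,
                "aggs": {"max_ts": {"max": {"field": metric_field}}},
            }
        },
    }


# Hypothetical field names; replace them with the ones in your index.
body = build_terms_request("host.keyword", "@timestamp", size=20, shard_size=500)
print(json.dumps(body, indent=2))
```

Comparing the values returned with a small and a large `shard_size` is one way to check whether the shard-level truncation is really what changes the numbers in the heatmap.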

If you still want more info on the topic you might want to post in the Elasticsearch channel of the forums.

Thanks,
Lee

