Sorting issue on keyword

Hi there,

I'm having some sorting issues. Below is an image of the sorting problem. I expect the keyword field to be sorted A, B, C, D, E, U.

However, it looks like when I set the time range to 12 hours or more, it usually (not always) works fine:

The associated mapping to the field:

    "mappings" : {
      "properties" : {
        ...
        "prediction" : {
          "properties" : {
            "global" : {
              "properties" : {
                "rating" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
            ...

Any help would be appreciated.

Hey @Xavier_Romero,

Based on the discussion in Incorrect Order by Descending in Kibana visualization, I expect that the legend order is dependent on the order of the aggregation results from Elasticsearch, and is unfortunately not something you can control independently of the aggregation result.

We have an issue tracking this here, which goes back quite a ways: https://github.com/elastic/kibana/issues/3118

Thank you @Larry_Gregory,

Given the status of the referenced issue, I'd better not expect a fix in the coming years...

It's really frustrating that Kibana, being so powerful in many respects, fails at very simple and basic tasks like showing labels in the requested order.

Regards.

This is not as much about the legend @Larry_Gregory, as about the 'Order By' being broken for split-series.

(i.e. forget about the 'legend' - it's not that important really).

There is a bug that can be shown using your demo here

Now, for WinXP, the count for 'error' (25) is bigger than the count for 'login' (21), but they come in a different order.

Play with 'Ascending' vs 'Descending' and this seems persistent. I see this issue elsewhere in my Kibana graphs and noticed the following too:

  • it looks like the sorting might be applied to the 'majority' of events, so outliers get sorted the wrong way (not individually)? Why??

  • changing the value of 'Size' affects the issue (for some reason I see that going above 5 exposes it, but that might be a coincidence and really a result of the 'majority-sorting' above)

I believe the order of labels simply follows this 'majority sorting'. It would also explain why @Xavier_Romero sees it only 'sometimes' (I see it on my graphs 'sometimes' too): probably this is when the 'majority' flips things the 'wrong' way.

Could this be looked into / solved? If not, what is the best way to AVOID this issue and, for bar charts (not necessarily for charts like attached here) - make them 'show the truth' and not 'lie' about the order for points where this order changes? Some other charts that can be used?

Thank you!

Update:

Now there might be an even bigger bug: when I reduce 'size' for the X-axis aggregation to 4 in the 'demo' example above, the chart goes awry, and:

  • number of X-axis points does NOT decrease to 4 (should this be expected?)
  • some tags go missing (in IOS only 3 left, etc)

Again - play yourself here to see this in 'action'.

Seems not right to me, or it requires a really good explanation..

@formiaczek, thanks for clarifying your issue and linking to demos, that will help me quite a bit. I'm going to take a closer look at this in the next couple of days, and I'll follow up here with my findings.

Thank you @Larry_Gregory!

One more thing to add, hopefully useful: this is something that I see back in ELK stack 6.5, so it's not likely something introduced recently.

Hello @Larry_Gregory, any findings or updates so far?
Thanks,
Lukasz

@formiaczek sorry for the delay in responding. I unfortunately don't have a definitive answer, but a teammate pointed me to a relevant discussion here: https://github.com/elastic/kibana/issues/17532#issuecomment-465452130

When using split series, he mentions that the sort order is determined by the ordering of the first aggregation that comes back from Elasticsearch.

In the demo link you provided above, I can see the following:

Screenshot for reference, showing error appearing first, even though it's not the "smallest" by count:

Request to Elasticsearch

{
  "aggs": {
    "2": {
      "terms": {
        "field": "tags.keyword",
        "order": {
          "_count": "asc"
        },
        "size": 7
      },
      "aggs": {
        "3": {
          "terms": {
            "field": "machine.os.keyword",
            "order": {
              "_count": "desc"
            },
            "missing": "__missing__",
            "size": 8
          }
        }
      }
    }
  },
  "size": 0,
  "_source": {
    "excludes": []
  },
  "stored_fields": [
    "*"
  ],
  "script_fields": {
    "hour_of_day": {
      "script": {
        "source": "doc['timestamp'].value.getHourOfDay()",
        "lang": "painless"
      }
    }
  },
  "docvalue_fields": [
    {
      "field": "timestamp",
      "format": "date_time"
    },
    {
      "field": "utc_time",
      "format": "date_time"
    }
  ],
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "match_all": {}
        },
        {
          "range": {
            "timestamp": {
              "format": "strict_date_optional_time",
              "gte": "2020-05-06T11:49:53.540Z",
              "lte": "2020-05-13T11:49:53.540Z"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}

Response from Elasticsearch

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1701,
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "2": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "3": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "ios",
                "doc_count": 24
              },
              {
                "key": "win 8",
                "doc_count": 19
              },
              {
                "key": "win xp",
                "doc_count": 19
              },
              {
                "key": "osx",
                "doc_count": 15
              },
              {
                "key": "win 7",
                "doc_count": 11
              }
            ]
          },
          "key": "error",
          "doc_count": 88
        },
        {
          "3": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "win xp",
                "doc_count": 29
              },
              {
                "key": "ios",
                "doc_count": 16
              },
              {
                "key": "osx",
                "doc_count": 16
              },
              {
                "key": "win 7",
                "doc_count": 15
              },
              {
                "key": "win 8",
                "doc_count": 14
              }
            ]
          },
          "key": "login",
          "doc_count": 90
        },
        {
          "3": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "win xp",
                "doc_count": 37
              },
              {
                "key": "win 8",
                "doc_count": 34
              },
              {
                "key": "win 7",
                "doc_count": 27
              },
              {
                "key": "ios",
                "doc_count": 24
              },
              {
                "key": "osx",
                "doc_count": 24
              }
            ]
          },
          "key": "warning",
          "doc_count": 146
        },
        {
          "3": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "win xp",
                "doc_count": 66
              },
              {
                "key": "win 8",
                "doc_count": 63
              },
              {
                "key": "osx",
                "doc_count": 60
              },
              {
                "key": "win 7",
                "doc_count": 53
              },
              {
                "key": "ios",
                "doc_count": 51
              }
            ]
          },
          "key": "security",
          "doc_count": 293
        },
        {
          "3": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "win xp",
                "doc_count": 349
              },
              {
                "key": "ios",
                "doc_count": 255
              },
              {
                "key": "osx",
                "doc_count": 240
              },
              {
                "key": "win 8",
                "doc_count": 240
              },
              {
                "key": "win 7",
                "doc_count": 234
              }
            ]
          },
          "key": "info",
          "doc_count": 1318
        },
        {
          "3": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "win xp",
                "doc_count": 388
              },
              {
                "key": "osx",
                "doc_count": 277
              },
              {
                "key": "ios",
                "doc_count": 274
              },
              {
                "key": "win 7",
                "doc_count": 264
              },
              {
                "key": "win 8",
                "doc_count": 264
              }
            ]
          },
          "key": "success",
          "doc_count": 1467
        }
      ]
    }
  }
}

In the response that Kibana receives from Elasticsearch, we see that the outer aggregation (aggregating on tags.keyword) is sorted by count, ascending. This count is across all sub-buckets (machine.os.keyword). If we take out the details of the sub-aggs, we indeed see that the outer aggregation is sorted correctly, and reflects the order that we see rendered in the chart and legend.

{
  "aggregations": {
    "2": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "error",
          "doc_count": 88
        },
        {
          "key": "login",
          "doc_count": 90
        },
        {
          "key": "warning",
          "doc_count": 146
        },
        {
          "key": "security",
          "doc_count": 293
        },
        {
          "key": "info",
          "doc_count": 1318
        },
        {
          "key": "success",
          "doc_count": 1467
        }
      ]
    }
  }
}

So even though error is not the smallest bucket in every one of the sub-buckets, when you look at error across all sub-buckets, it has the smallest count relative to all others. So it is possible for outliers in the data to alter the way the buckets are sorted, which in turn impacts the way Kibana renders the visualization.
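
The difference between the two orderings can be reproduced with a few lines of Python. This is only a sketch: the counts are copied from the Elasticsearch response above, while `global_order` and `order_within` are hypothetical helper names introduced for illustration and are not part of Kibana or Elasticsearch.

```python
# Counts copied from the Elasticsearch response above (tag -> OS -> doc_count).
counts = {
    "error":    {"ios": 24, "win 8": 19, "win xp": 19, "osx": 15, "win 7": 11},
    "login":    {"win xp": 29, "ios": 16, "osx": 16, "win 7": 15, "win 8": 14},
    "warning":  {"win xp": 37, "win 8": 34, "win 7": 27, "ios": 24, "osx": 24},
    "security": {"win xp": 66, "win 8": 63, "osx": 60, "win 7": 53, "ios": 51},
    "info":     {"win xp": 349, "ios": 255, "osx": 240, "win 8": 240, "win 7": 234},
    "success":  {"win xp": 388, "osx": 277, "ios": 274, "win 7": 264, "win 8": 264},
}


def global_order(counts):
    """Tags sorted ascending by their total across all OS sub-buckets."""
    return sorted(counts, key=lambda tag: sum(counts[tag].values()))


def order_within(counts, os_name):
    """Tags sorted ascending by their count inside one OS sub-bucket."""
    return sorted(counts, key=lambda tag: counts[tag][os_name])


# Order of the outer aggregation: 'error' (88 in total) comes before 'login' (90).
print(global_order(counts))
# Order inside the 'ios' sub-bucket: 'login' (16) comes before 'error' (24).
print(order_within(counts, "ios"))
```

The first list matches the bucket order in the response, and therefore the rendered order, while the second shows that a single sub-bucket can legitimately disagree with it.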

Thank you @Larry_Gregory, very good analysis and explanation!

Still though, this is not the only time this has been discussed (and possibly rejected as a 'duplicate' etc.), and there were more discussions, e.g. here:



To me the whole idea of splitting series, creating sub-buckets, is to 'make sense' of the data after being able to present it in a controlled way. Note that currently it is possible to:

  • re-arrange the order of buckets (and it is useful and affects the way data is being processed in subsequent buckets)
  • within each bucket (for most of the aggregations) apply ordering ( 'OrderBy') - again, expected as being done within the bucket it was set-up for (and not the most-outer/parent bucket).
  • It is also possible to derive ordering from other buckets ('metric')

It also creates an impression of being 'somewhat' similar to Splunk processing of Events as time-series 'streams' - at each step events within the stream get processed or re-arranged, producing results that can in turn be used by subsequent steps. I don't intend to compare it directly: these are both different systems, but it could be another reason why engineers who also worked with Splunk - get surprised by this behaviour in Elasticsearch.

Furthermore, current Elasticsearch behaviour is 'unpredictable' - the outliers can be seen with either: 'expected' or 'unexpected' order - all depending on what data was there in the 'first' bucket when results were calculated.

Now to me, given this, all the previously raised issues that discuss this behaviour, and now your clear explanation, it seems that this behaviour is incorrect in its current form, or at least unexpected and confusing (even though it is plain old behaviour that was there before and is probably the most efficient one; note that efficiency here could really be second to correctness).

But to not just complain, I think that if 'Order by' set against a given bucket is not applied within this bucket, either:

  • 'OrderBy' SHOULD NOT be available and accessible for sub-buckets at all (perhaps it could appear in the 'root/parent' bucket upon adding sub-buckets instead). This would make it unambiguously clear where and when it is applied in the processing,
  • 'OrderBy' is applied within the child buckets (most expected behaviour)
  • the combination of the two (i.e to preserve 'existing' behaviour but also allow for 'expected'): making 'OrderBy' have at least two (or 3) options, e.g.:
  1. Apply in current bucket
  2. Apply to most outer (root) bucket
  3. More 'generic' version of the above: 'Apply to 'XX' bucket' where XX is 'current', 'root' or any of the buckets on the path from 'current' to 'root'.

Otherwise, when someone really depends on (or cares about) the order in which things are sorted, and because 'buckets' are 'grouping' items, sorting within that grouping is something that is truly needed. Unfortunately there is currently NO way to achieve this, and there is an impression that there is, because it 'sometimes' works like that.

Another issue that I also mentioned is with the 'Size' within the bucket. To me the expected behaviour is that the two, 'OrderBy' and 'Size X', would result in the 'Top' (or 'Bottom' for ascending) X documents as seen in the 'current' bucket (not of what there was in the 'root' bucket).
The current behaviour is totally surprising: when only updating 'size', some previously seen values that still fall in the 'Top/Bottom X' range can disappear (or be replaced by other ones) because the 'Top' ('Bottom') X is calculated for the 'root' bucket instead.

Again, if it was possible to control and apply the 'SortBy' to a selected bucket (even as a non-default option), this 'Size' related issue would also become 'solvable' in aggregations.
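
One possible reading of the 'Size' effect is sketched below. It assumes that Elasticsearch applies the `size` of the inner `terms` aggregation separately inside each outer bucket, as the nested request earlier in the thread suggests. It is not a confirmed description of what Kibana does with the result. The counts are three of the outer buckets from that response; `top_n` and `apply_size` are hypothetical helpers, and the tie-break by key is an assumption that does not affect this data.

```python
# Three outer buckets from the response earlier in the thread (tag -> OS -> doc_count).
counts = {
    "error":    {"ios": 24, "win 8": 19, "win xp": 19, "osx": 15, "win 7": 11},
    "login":    {"win xp": 29, "ios": 16, "osx": 16, "win 7": 15, "win 8": 14},
    "security": {"win xp": 66, "win 8": 63, "osx": 60, "win 7": 53, "ios": 51},
}


def top_n(os_counts, n):
    """Keep the n largest entries of one bucket (ties broken by key: an assumption)."""
    ranked = sorted(os_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:n])


def apply_size(counts, n):
    """Apply the inner 'size' independently inside every outer bucket."""
    return {tag: top_n(os_counts, n) for tag, os_counts in counts.items()}


trimmed = apply_size(counts, 4)

# Every tag keeps 4 OS values, but not the same 4, so the union is still 5.
x_axis = sorted({os_name for kept in trimmed.values() for os_name in kept})
print(len(x_axis))

# 'ios' fell out of the top 4 for 'security', so that tag goes missing for ios.
tags_for_ios = [tag for tag, kept in trimmed.items() if "ios" in kept]
print(tags_for_ios)
```

Under that assumption, reducing the size to 4 does not reduce the number of X-axis points to 4, and some tags disappear for individual X-axis values, which resembles the symptoms described in the update above.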

What do you think?

@lukeelmers

Thanks!
Lukasz.

Hey @formiaczek,

Thanks for taking the time to write up your thoughts on this. I agree that the current behavior of sorting on first bucket is not intuitive, and is particularly confusing in some scenarios (especially stacked bar charts).

There are a few fundamental problems that make this a non-trivial issue to solve:

First, there's the question of UX and what a user would reasonably expect (sorting each bucket on values of its subaggs vs sorting based on the entire set of data overall). I like your idea of making a user-selectable option for how the sorting is applied. This is similar to what someone suggests in the comments on the GitHub issue. Removing orderBy for subbuckets is also an interesting idea for a short-term solution, but my main concern with that would be that it's a very widespread change to address a very narrow issue. For example, sorting alphabetically still works as expected because there is no bucket-to-bucket variance in the results.
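
The point about alphabetical sorting can be checked against the response shown earlier in the thread. The following is a small sketch using two of its outer buckets; `by_count_desc` and `by_key` are hypothetical helpers, and it assumes both buckets contain the same set of keys, which holds for this data.

```python
# Two outer buckets from the response earlier in the thread (OS -> doc_count).
error = {"ios": 24, "win 8": 19, "win xp": 19, "osx": 15, "win 7": 11}
login = {"win xp": 29, "ios": 16, "osx": 16, "win 7": 15, "win 8": 14}


def by_count_desc(bucket):
    """Keys ordered by descending count (ties broken by key: an assumption)."""
    return sorted(bucket, key=lambda k: (-bucket[k], k))


def by_key(bucket):
    """Keys ordered alphabetically; the counts play no role."""
    return sorted(bucket)


# Ordering by count differs from bucket to bucket...
print(by_count_desc(error) == by_count_desc(login))
# ...while ordering by key is the same in every bucket.
print(by_key(error) == by_key(login))
```

Because the alphabetical order does not depend on the counts, it does not matter which bucket the chart takes its order from.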

Second, there's the technical issue of how we model the data before rendering the visualization. The current way the data are modeled simply doesn't preserve the sort order from Elasticsearch by the time it makes its way to a chart, as demonstrated here. Changing this to preserve sort order would be an update that affects all point series visualizations, so it is something that will require an extra level of planning and care when implemented, since it's not a one-off bug fix.

I'd encourage you to share any more thoughts you have on the GitHub issue (or just :+1: the original comment). That's the best way to make sure the right folks see it, and watching +1s helps us to best prioritize. In the meantime, I'll ping the team that works in this area to make sure it's still being monitored.

Cheers,

Luke

Thank you @lukeelmers for replying, and I surely will add this to the GitHub discussion too!

Now why I really care is that I have a use-case where my 'outer' bucket is collecting all the events by a specific 'term' (i.e. an 'id'), and split-series (inner bucket) gives me a grouping by another term (e.g. 'event'). Then I want to stack-up duration of these 'events', sorted by timestamp. All in all, this sorting by timestamp doesn't (often) really work for me in this scenario.
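
For reference, a request of that shape could be built roughly as follows. This is a sketch under stated assumptions: the field names (`id.keyword`, `event.keyword`, `timestamp`, `duration`), the aggregation names and the `build_request` helper are all hypothetical and do not come from the actual index. Ordering a `terms` aggregation by a single-value metric sub-aggregation such as `min` is standard Elasticsearch behaviour, but this only controls the order inside each outer bucket of the response. Whether the chart preserves that order is exactly the open question in this thread.

```python
import json


def build_request(id_field, event_field, time_field, duration_field, size=10):
    """Nested terms aggregation: per id, events ordered by their earliest timestamp."""
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "by_id": {
                "terms": {"field": id_field, "size": size},
                "aggs": {
                    "by_event": {
                        "terms": {
                            "field": event_field,
                            "size": size,
                            # order the event buckets by the metric defined below
                            "order": {"first_seen": "asc"},
                        },
                        "aggs": {
                            "first_seen": {"min": {"field": time_field}},
                            "total_duration": {"sum": {"field": duration_field}},
                        },
                    }
                },
            }
        },
    }


# Hypothetical field names, for illustration only.
body = build_request("id.keyword", "event.keyword", "timestamp", "duration")
print(json.dumps(body, indent=2))
```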

Regards,
Lukasz.