Split series containing top N terms sorted by numeric field

Hello,

I need a bit of help to build a split series visualization.

I have a time-series index containing temperature and humidity readings for different locations. Here is a sample entry:

{
    "@timestamp": "2019-02-04T00:00:00",
    "location": "Rome",
    "temperature": 26,
    "humidity": 43
}

I want to build a line chart that, for a given time range, plots humidity over time for each of the top 5 hottest locations (in practice: get the max temperature per location over the time range and keep the top 5). Is this achievable in Kibana?

Thank you.

I think so. Which visual builder are you using?

If it's a line chart ->

metric: max or average humidity, whichever you're looking for
Terms aggregation => temperature ordered by custom metric (avg temperature) => date histogram

Hi Jon,

Thanks for the answer.

I'm using Kibana and it is indeed a line chart. What you describe is almost what I came up with myself:

  • Metric: average "humidity"
  • Buckets:
    1. Split Series: Terms aggregation on "location" order by custom metric = max "temperature"
    2. X-Axis: Date histogram on "@timestamp"
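
For readers who want to see this outside the Kibana UI, the configuration above corresponds roughly to the request body sketched below. This is an approximation, not the exact request Kibana generates. The aggregation names (by_location, max_temp, over_time, avg_humidity), the helper build_request and the "1h" interval are made up for illustration. It also assumes "location" is mapped as a keyword field and that the cluster accepts "fixed_interval" (older versions use "interval" instead).

```python
import json


def build_request(top_n=5, interval="1h"):
    """Build a search body: top_n locations by max temperature, humidity over time.

    All aggregation names are hypothetical; only the field names
    ("location", "temperature", "humidity", "@timestamp") come from the sample entry.
    """
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "by_location": {
                "terms": {
                    "field": "location",  # assumes a keyword mapping
                    "size": top_n,
                    "order": {"max_temp": "desc"},  # order by the custom metric
                },
                "aggs": {
                    # Custom metric used only for ordering the terms buckets.
                    "max_temp": {"max": {"field": "temperature"}},
                    # X-axis: one bucket per interval, with the plotted metric.
                    "over_time": {
                        "date_histogram": {
                            "field": "@timestamp",
                            "fixed_interval": interval,  # assumption: newer syntax
                        },
                        "aggs": {"avg_humidity": {"avg": {"field": "humidity"}}},
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_request(), indent=2))
```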

Unfortunately this doesn't produce the expected line chart, because the index data is spread across multiple shards. In the per-shard (map) phase of the request, each shard returns its own top 5, which doesn't necessarily contain the entries of the final top 5. In the reduce phase the results from all shards are merged and the top 5 is identified, but not all historical data for each of those 5 locations is available.
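
To make the problem concrete, here is a toy simulation of that behaviour. It is a simplified model, not how Elasticsearch is actually implemented: the functions shard_top and reduce_shards, the two "shards" and all values are invented for illustration, and each document is just a (location, temperature, humidity) tuple.

```python
from collections import defaultdict


def shard_top(docs, shard_size):
    """Per-shard step: bucket by location, keep only the shard_size hottest buckets."""
    buckets = defaultdict(lambda: {"max_temp": float("-inf"), "humidity": []})
    for location, temperature, humidity in docs:
        bucket = buckets[location]
        bucket["max_temp"] = max(bucket["max_temp"], temperature)
        bucket["humidity"].append(humidity)
    ranked = sorted(buckets.items(), key=lambda kv: kv[1]["max_temp"], reverse=True)
    return dict(ranked[:shard_size])


def reduce_shards(shard_results, size):
    """Coordinating-node step: merge what the shards sent, keep the final top `size`."""
    merged = defaultdict(lambda: {"max_temp": float("-inf"), "humidity": []})
    for result in shard_results:
        for location, bucket in result.items():
            target = merged[location]
            target["max_temp"] = max(target["max_temp"], bucket["max_temp"])
            target["humidity"].extend(bucket["humidity"])
    ranked = sorted(merged.items(), key=lambda kv: kv[1]["max_temp"], reverse=True)
    return dict(ranked[:size])


# Invented data: Rome is the hottest overall, but only because of shard_a.
shard_a = [("Rome", 40, 30), ("Cairo", 38, 20), ("Oslo", 10, 70)]
shard_b = [("Rome", 12, 55), ("Cairo", 39, 22), ("Oslo", 15, 65), ("Lima", 20, 60)]

# Each shard returns only its local top 1: shard_b sends Cairo and drops Rome.
truncated = reduce_shards([shard_top(s, 1) for s in (shard_a, shard_b)], size=1)
# Each shard returns all of its buckets: Rome's data from shard_b survives.
complete = reduce_shards([shard_top(s, 4) for s in (shard_a, shard_b)], size=1)

if __name__ == "__main__":
    print(truncated["Rome"]["humidity"])
    print(complete["Rome"]["humidity"])
```

In the truncated run Rome is still identified as the hottest location, but the humidity reading stored on the second shard never reaches the coordinating node, which matches the gaps seen in the chart.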

Hopefully there is another solution for this.

Thanks.

Hmm, that's a tough one. Is bumping the shard_size an option? https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_shard_size_3
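
As a rough sketch of what that means at the request level: shard_size is an extra parameter on the terms aggregation that controls how many buckets each shard sends back to the coordinating node (when it is not set, Elasticsearch derives a default of size * 1.5 + 10). The helper with_shard_size and the value 1000 below are made up for illustration. In Kibana's classic visualization editor the same parameter can usually be passed through the aggregation's "JSON Input" field, though that depends on the Kibana version.

```python
import copy
import json


def with_shard_size(terms_agg, shard_size):
    """Return a copy of a terms aggregation body with shard_size set.

    Hypothetical helper: it only edits the dict, it does not talk to a cluster.
    """
    updated = copy.deepcopy(terms_agg)
    updated["shard_size"] = shard_size
    return updated


# Terms aggregation on "location", ordered by a max-temperature sub-aggregation
# (the sub-aggregation name "max_temp" is an illustrative assumption).
terms_agg = {"field": "location", "size": 5, "order": {"max_temp": "desc"}}

# 1000 is an arbitrary example; it needs to be at least the number of distinct
# locations on a shard for every top-5 location to be returned in full.
bumped = with_shard_size(terms_agg, 1000)

if __name__ == "__main__":
    print(json.dumps(bumped, indent=2))
```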


This does work, but with the expected drawback: it takes much longer to produce a result, since it needs to bring everything from each shard to the coordinating node. In production it might actually cause the whole system to fail due to memory pressure.

Also, if I set a lower value, I'm not guaranteed to get the historical values for all of the top N.

But this is a great trick, thanks for pointing it out.
