Scripted metric aggregation using complex terms buckets as input data


(Dmitriy Kulichkin) #1

I have an index containing various mobile platform feature-related events, for example Bluetooth being switched on or off. To simplify, let's say the relevant type's schema looks like this:

"BLUETOOTH_STATUS": {
  "properties": {
    "device_id" : { "type" : "string", "index" : "not_analyzed" },
    "time"      : { "type" : "date" },        
    "os"        : { "type" : "string", "index" : "not_analyzed" },
    "status"    : { "type": "boolean" }
  }
}

That is, every device can emit multiple Bluetooth on/off events while using the application. Ultimately I need to show the number of devices whose most recent status is enabled. So far I haven't found a better way to approach this than the following combination of terms and top_hits aggregations:

"BLUETOOTH_STATUS": {
  "filter": {
    "term": {"_type": "BLUETOOTH_STATUS"}
  },
  "aggs": {    
    "by_device": {
      "terms": { "field": "device_id", "size": 0 },
      "aggs": {
        "max_date": {
          "top_hits": {
            "size": 1,
            "sort": [ { "time": { "order": "desc" } } ],
            "_source": { "include": ["status"]}
          }
        }
      }
    }    
  }
}

This returns a big payload that then has to be processed on the client: take the last feature status per device_id and count only the positive ones:

// by_device is the "by_device" terms aggregation taken from the response
const bluetoothEnabledCount = by_device.buckets.filter((device) =>
   device.max_date.hits.hits[0]._source.status).length;
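
To make that client-side step concrete, here is a minimal self-contained sketch. It assumes the response is shaped like the aggregation above. The sample buckets, the device keys, and the helper name countEnabledDevices are made up for illustration only.

```javascript
// Hypothetical sample response, shaped like the terms/top_hits aggregation above.
// The device keys and statuses are invented for illustration.
const sampleResponse = {
  aggregations: {
    BLUETOOTH_STATUS: {
      doc_count: 5,
      by_device: {
        buckets: [
          { key: "device-a", max_date: { hits: { hits: [{ _source: { status: true } }] } } },
          { key: "device-b", max_date: { hits: { hits: [{ _source: { status: false } }] } } },
          { key: "device-c", max_date: { hits: { hits: [{ _source: { status: true } }] } } }
        ]
      }
    }
  }
};

// Count the buckets whose single top hit (the latest event) has status === true.
// Buckets without a hit are treated as "not enabled".
function countEnabledDevices(byDevice) {
  return byDevice.buckets.filter((device) => {
    const latest = device.max_date.hits.hits[0];
    return Boolean(latest && latest._source.status);
  }).length;
}

const bluetoothEnabledCount = countEnabledDevices(
  sampleResponse.aggregations.BLUETOOTH_STATUS.by_device
);
console.log(bluetoothEnabledCount);
```

The work here is proportional to the number of device buckets, which is exactly the cost I would like to move to the server.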

In other words, with a few thousand unique users (device_id) in the system, I always have to deal with a payload of a few thousand device_id buckets.

Is there a better way to compute this metric on the server in one go, or at least to avoid doing it on the client? I started looking at scripted_metric but couldn't find how to pass my terms/top_hits buckets to it as init_script data.

So, is it possible to approach the problem more efficiently than I'm doing now?


(Mark Harwood) #2

See this similar "devices with last status of X" question here: Find servers whose last logged event was "error"

