I want to compare the daily average of a metric (the frequency of words appearing in texts) to the value of a specific day. This is during a week. My goal is to check whether there's a spike. If the last day is way higher than the daily average, I'd trigger an alarm.

So from my input in Elasticsearch I compute the daily average during the week and find out the value for the last day of that week.

For getting the daily average for the week, I simply cut a week's worth of data using a `range`

query on `date`

field, so *all* my available data is the given week. I compute the sum and divide by 7 for a daily average.

For getting the last day's value, I did a terms aggregation on the `date`

field with descending order and size 1

The whole output is as follows. Here you can see words "rama0" and "rama1" with their corresponding frequencies.

```
{
"aggregations" : {
"the_keywords" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "rama0",
"doc_count" : 4200,
"the_last_day" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 3600,
"buckets" : [
{
"key" : 1580169600000,
"key_as_string" : "2020-01-28T00:00:00.000Z",
"doc_count" : 600,
"the_last_day_frequency" : {
"value" : 3000.0
}
}
]
},
"the_weekly_sum" : {
"value" : 21000.0
},
"the_daily_average" : {
"value" : 3000.0
}
},
{
"key" : "rama1",
"doc_count" : 4200,
"the_last_day" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 3600,
"buckets" : [
{
"key" : 1580169600000,
"key_as_string" : "2020-01-28T00:00:00.000Z",
"doc_count" : 600,
"the_last_day_frequency" : {
"value" : 3000.0
}
}
]
},
"the_weekly_sum" : {
"value" : 21000.0
},
"the_daily_average" : {
"value" : 3000.0
}
},
[...]
]
}
}
}
```

Now I have `the_daily_average`

in a high level of the output, and `the_last_day_frequency`

in the single-element buckets list in `the_last_day`

aggregation. I cannot use a `bucket_script`

to compare those, because I cannot refer to a single bucket (if I place the script outside `the_last_day`

aggregation) and I cannot refer to higher-level aggregations if I place the script inside `the_last_day`

.

IMO the reasonable thing to do would be to put the script outside the aggregation and use a `buckets_path`

using the `<AGG_NAME><MULTIBUCKET_KEY>`

syntax mentioned in the docs, but I have tried `"var1": "the_last_day[1580169600000]>the_last_day_frequency"`

and variations (hardcoding first until it works), but I haven't been able to refer to a particular bucket.

My ultimate goal is to have a list of keywords for which the last day frequency greatly exceeds the daily average.

For anyone interested, my current query is as follows. Notice that the part I'm struggling with is commented out.

```
body='{
"query": {
"range": {
"date": {
"gte": "START",
"lte": "END"
}
}
},
"aggs": {
"the_keywords": {
"terms": {
"field": "keyword",
"size": 100
},
"aggs": {
"the_weekly_sum": {
"sum": {
"field": "frequency"
}
},
"the_daily_average" : {
"bucket_script": {
"buckets_path": {
"weekly_sum": "the_weekly_sum"
},
"script": {
"inline": "return params.weekly_sum / 7"
}
}
},
"the_last_day": {
"terms": {
"field": "date",
"size": 1,
"order": {"_key": "desc"}
},
"aggs": {
"the_last_day_frequency": {
"sum": {
"field": "frequency"
}
}
}
}/*,
"the_spike": {
"bucket_script": {
"buckets_path": {
"last_day_frequency": "the_last_day>the_last_day_frequency",
"daily_average": "the_daily_average"
},
"script": {
"inline": "return last_day_frequency / daily_average"
}
}
}*/
}
}
}
}'
```