I have a message index with a 'date' and 'channelId' fields:
POST message/_doc
{
"channelId": "channel_twitter",
"date": "2018-03-08T14:01:23.385Z"
}
I need to create an aggregation that calculates the average number of messages per day of the week per channel. For example - for twitter - an average of 5 messages on Mondays, 10 messages on Tuesdays etc.
Looking at the day of week example in the Date Histogram documentation, I managed to get the total number of messages per channel per day of week:
You could use pipeline aggregations for this. Pipeline aggregations are aggregations that work on the result of other aggregations.
In this case, I would first create a weekly histogram inside of your total_messages_per_day_of_week aggregation. This will create a bucket per week of data, and the number of buckets in this histogram corresponds to the total number of Mondays, Tuesdays etc.
Next, I would use a bucket_script aggregation to divide the total messages per week day by the number of weeks of data. The bucket_script aggregation can calculate a metric based on the outputs of other aggregations by referring to those aggregations by their paths. It has access to special paths like _count (which can be used to get the total number of messages) and _bucket_count (to get the number of weeks). Dividing one by the other gets you what you want.
Putting it all together, the request would look like this:
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.