Average per day of week aggregation

Hello,

I have a message index with a 'date' and 'channelId' fields:

POST message/_doc
{
	"channelId": "channel_twitter", 
	"date": "2018-03-08T14:01:23.385Z"
}

I need to create an aggregation that calculates the average number of messages per day of the week per channel. For example - for twitter - an average of 5 messages on Mondays, 10 messages on Tuesdays etc.
Looking at the day of week example in the Date Histogram documentation, I managed to get the total number of messages per channel per day of week:

{
  "size": 0,
  "aggs": {
    "messages_per_channel": {
      "terms": {
        "field": "channelId"
      },
      "aggregations": {
        "total_messages_per_day_of_week": {
          "terms": {
            "script": {
              "lang": "painless",
              "source": "doc['date'].value.dayOfWeek"
            }
          }
        }
      }
    }
  }
}

but I can't figure out how to calculate an average based on the total number of Mondays, Tuesdays etc. since the first message.

You could use pipeline aggregations for this. Pipeline aggregations are aggregations that work on the result of other aggregations.

In this case, I would first create a weekly histogram inside of your total_messages_per_day_of_week aggregation. This will create a bucket per week of data, and the number of buckets in this histogram corresponds to the total number of Mondays, Tuesdays etc.

Next, I would use a bucket_script aggregation to divide the total messages per week day by the number of weeks of data. The bucket_script aggregation can calculate a metric based on the outputs of other aggregations by referring to those aggregations by their paths. It has access to special paths like _count (which can be used to get the total number of messages) and _bucket_count (to get the number of weeks). Dividing one by the other gets you what you want.

Putting it all together, the request would look like this:

{
  "size": 0,
  "aggs": {
    "messages_per_channel": {
      "terms": {
        "field": "channelId"
      },
      "aggs": {
        "total_messages_per_day_of_week": {
          "terms": {
            "script": {
              "lang": "painless",
              "source": "doc['date'].value.dayOfWeek"
            }
          },
          "aggs": {
            "number_of_weeks": {
              "date_histogram": {
                "field": "date",
                "interval": "week"
              }
            },
            "average_messages_per_day_of_week": {
              "bucket_script": {
                "buckets_path": {
                  "doc_count": "_count",
                  "number_of_weeks": "number_of_weeks._bucket_count"
                },
                "script": "params.doc_count / params.number_of_weeks"
              }
            }
          }
        }
      }
    }
  }
}

Works beautifully. Thanks a lot !