Hi
I'm trying to analyze user retention metrics based on access logs.
My index contains documents like these:
{ "timestamp": "2015-11-01", "user_id": "1" }
{ "timestamp": "2015-11-02", "user_id": "1" }
{ "timestamp": "2015-11-02", "user_id": "2" }
{ "timestamp": "2015-11-03", "user_id": "2" }
A query
{ "aggs": { "ids": {
"terms": { "field": "user_id" },
"aggs": { "last_date": { "max": { "field": "timestamp", "format": "date" }}}
}}}
{ "buckets": [
{ "key": "1", "last_date": { "value_as_string": "2015-11-02"}},
{ "key": "2", "last_date": { "value_as_string": "2015-11-03"}}
]}
Now I need to find out how many unique "keys" I have for each day, using date histogram.
I've tried referencing a bucket like here
{
"aggs": {
"ids": {
"terms": { "field": "user_id" },
"aggs": { "last_date": { "max": { "field": "timestamp", "format": "date" }}}
},
"dates": {
"date_histogram": {
"buckets_path": "ids>last_date",
"field": "timestamp",
"interval": "day",
"format": "date"
}
}
}
}
But apparently `date_histogram` cannot accept `buckets_path` parameter.
Is there a way to apply Date histogram over a bucket? Can both these aggregations be calculated in a single query?