Hello guys,
I am facing up a problem for which I have been unable to find an answer for weeks now.
Let's say I have an index named "users", and a type "profiles", where each document represents a user profile for an application.
I keep track of some important "events", that are stored and indexed in "profiles" type as nested documents.
The mappings of "users" index:
PUT users
{
"mappings" : {
"profiles": {
"properties": {
"firstname": {
"type": "string"
},
"lastname": {
"type": "string"
},
"events": {
"type": "nested",
"properties": {
"date": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"type": {
"type": "byte"
}
}
}
}
}
}
}
Each new event coming up for a profile will be stored as a new event, with its date and type.
Now the use case: I want to know, from 2015-05-01 to 2015-06-01, how much profiles did trigger how much events ?
Example of documents containing test data:
POST users/profiles/USER1
{
"firstname": "Carlos",
"lastname": "Rodriguez",
"events": [
{
"date": "2015-04-13 17:41:21",
"type": 1
},
{
"date": "2015-05-02 14:14:03",
"type": 2
},
{
"date": "2015-05-09 09:37:57",
"type": 2
}
]
}
POST users/profiles/USER2
{
"firstname": "Frank",
"lastname": "Bar",
"events": [
{
"date": "2015-05-19 11:24:19",
"type": 2
}
]
}
POST users/profiles/USER3
{
"firstname": "Jason",
"lastname": "Foo",
"events": [
{
"date": "2015-05-14 13:21:14",
"type": 1
}
]
}
The following aggregation shoud be a good start to play arround with the problem:
GET users/profiles/_search
{
"size": 0,
"aggs": {
"nested-agg": {
"nested": {
"path": "events"
},
"aggs": {
"filtered": {
"range": {
"field": "events.date",
"ranges": [
{
"from": "2015-05-01 00:00:00",
"to": "2015-06-01 00:00:00"
}
]
},
"aggs": {
"by-type": {
"terms": {
"field": "events.type"
},
"aggs": {
"reversed": {
"reverse_nested": {}
}
}
}
}
}
}
}
}
}
And with the (truncated) result, I am able to know that:
2 profiles have triggered a total of 3x "Event Type 2"
1 profile has triggered a total of 1x "Event Type 1"
"by-type": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 2,
"doc_count": 3,
"reversed": {
"doc_count": 2
}
},
{
"key": 1,
"doc_count": 1,
"reversed": {
"doc_count": 1
}
}
]
}
I need a result like this:
1 profile has triggered 2x "Event Type 2" (USER1)
1 profile has triggered 1x "Event Type 2" (USER2)
1 profile has triggered 1x "Event Type 1" (USER3)
Is it possible to achieve this without using a script ?
I know I should somehow index for each profiles how many time they triggered each event, but the need
to have a date filter makes it, to me, impossible.