Hi,
I need to solve the following problem.
I have an index with operations performed by users. I need to detect the users who perform two or more update operations on the same element (value), but only if those operations were performed at least one hour apart.
So I thought of using a date_histogram aggregation:
GET index_lab/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {"terms": {"event.module": ["labs-app"]}},
        {"terms": {"composed_action": ["update"]}},
        {"terms": {"appuser": ["tester_1", "tester_2"]}},
        {"range": {"@timestamp": {"gte": "now-24h", "lte": "now"}}}
      ]
    }
  },
  "aggs": {
    "intervals": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "hour"
      },
      "aggs": {
        "results": {
          "terms": {
            "field": "value"
          }
        }
      }
    }
  }
}
This returns something like the following:
{
  "key_as_string" : "2020-05-08T10:00:00.000Z",
  "key" : 1588932000000,
  "doc_count" : 67,
  "results" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 55,
    "buckets" : [
      {
        "key" : "150713-5d",
        "doc_count" : 2
      },
      {
        "key" : "140249-e5",
        "doc_count" : 2
      },
      {
        "key" : "115801-46",
        "doc_count" : 1
      },
      {
        "key" : "116089-88",
        "doc_count" : 1
      }
    ]
  }
},
{
  "key_as_string" : "2020-05-08T11:00:00.000Z",
  "key" : 1588935600000,
  "doc_count" : 43,
  "results" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 32,
    "buckets" : [
      {
        "key" : "147973-7b",
        "doc_count" : 2
      },
      {
        "key" : "140249-e5",
        "doc_count" : 1
      },
      {
        "key" : "140250-86",
        "doc_count" : 1
      }
    ]
  }
}
How can I detect which values (keys of the second aggregation) appear in more than one parent bucket (in the example, value 140249-e5)?
I can solve this by post-processing the query results with Python, but I would like to know whether it can be done with an aggregation alone.
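For reference, the client-side approach could look roughly like the minimal sketch below. It assumes the search response has already been parsed into a Python dict and that the aggregations are named `intervals` and `results` as in the query; the function name `values_in_multiple_buckets` and the trimmed `sample` data are only illustrative.

```python
from collections import defaultdict


def values_in_multiple_buckets(response, histogram="intervals", terms="results"):
    """Map each value to the hourly buckets it appears in, keeping only
    values that show up in more than one bucket."""
    seen = defaultdict(list)
    for hour in response["aggregations"][histogram]["buckets"]:
        for entry in hour[terms]["buckets"]:
            seen[entry["key"]].append(hour["key_as_string"])
    return {value: hours for value, hours in seen.items() if len(hours) > 1}


# Hypothetical, trimmed-down response shaped like the output shown above.
sample = {
    "aggregations": {
        "intervals": {
            "buckets": [
                {
                    "key_as_string": "2020-05-08T10:00:00.000Z",
                    "results": {"buckets": [
                        {"key": "150713-5d", "doc_count": 2},
                        {"key": "140249-e5", "doc_count": 2},
                    ]},
                },
                {
                    "key_as_string": "2020-05-08T11:00:00.000Z",
                    "results": {"buckets": [
                        {"key": "147973-7b", "doc_count": 2},
                        {"key": "140249-e5", "doc_count": 1},
                    ]},
                },
            ]
        }
    }
}

print(values_in_multiple_buckets(sample))
# {'140249-e5': ['2020-05-08T10:00:00.000Z', '2020-05-08T11:00:00.000Z']}
```

Two caveats with this sketch: the terms aggregation only returns the top 10 values per bucket by default (note the non-zero sum_other_doc_count above), so its "size" would probably need to be raised, and two operations falling in adjacent hourly buckets can still be less than one hour apart.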
Thanks!