This is pretty readily achievable.
Suppose I set up my index like this:
PUT algorithms
{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "time": {
        "type": "date"
      }
    }
  }
}
and throw some events in for the algorithms:
POST _bulk
{ "index" : { "_index" : "algorithms"} }
{ "name" : "toto", "time": "2019-11-13T22:15:30Z" }
{ "index" : { "_index" : "algorithms"} }
{ "name" : "titi", "time": "2019-11-13T22:15:32Z" }
{ "index" : { "_index" : "algorithms"} }
{ "name" : "tutu", "time": "2019-11-13T22:15:34Z" }
{ "index" : { "_index" : "algorithms"} }
{ "name" : "tata", "time": "2019-11-13T22:15:36Z" }
{ "index" : { "_index" : "algorithms"} }
{ "name" : "toto", "time": "2019-11-13T22:15:37Z" }
{ "index" : { "_index" : "algorithms"} }
{ "name" : "titi", "time": "2019-11-13T22:15:38Z" }
{ "index" : { "_index" : "algorithms"} }
{ "name" : "tutu", "time": "2019-11-13T22:15:39Z" }
{ "index" : { "_index" : "algorithms"} }
{ "name" : "tata", "time": "2019-11-13T22:15:43Z" }
Yes, I'm only putting in two events per algo, but it wouldn't matter how many additional events were in there; this will still work.
I'm going to use aggregations here. First, I'm going to use a terms aggregation on the algorithm name to silo each algorithm's events into a bucket.
Then, within each of those buckets, I'm going to do a stats aggregation, whose output includes min and max (plus string representations of each).
POST algorithms/_search
{
  "size": 0,
  "aggs": {
    "algo": {
      "terms": {
        "field": "name",
        "size": 10
      },
      "aggs": {
        "stats": {
          "stats": {
            "field": "time"
          }
        }
      }
    }
  }
}
(You can see there is no query in the body; it is implicitly match_all, and I've provided "size": 0 because I'm not interested in having "hits" - the documents that result from the search. I only want "aggs".)
The results:
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 8,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"algo" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "tata",
"doc_count" : 2,
"stats" : {
"count" : 2,
"min" : 1.573683336E12,
"max" : 1.573683343E12,
"avg" : 1.5736833395E12,
"sum" : 3.147366679E12,
"min_as_string" : "2019-11-13T22:15:36.000Z",
"max_as_string" : "2019-11-13T22:15:43.000Z",
"avg_as_string" : "2019-11-13T22:15:39.500Z",
"sum_as_string" : "2069-09-25T20:31:19.000Z"
}
},
{
"key" : "titi",
"doc_count" : 2,
"stats" : {
"count" : 2,
"min" : 1.573683332E12,
"max" : 1.573683338E12,
"avg" : 1.573683335E12,
"sum" : 3.14736667E12,
"min_as_string" : "2019-11-13T22:15:32.000Z",
"max_as_string" : "2019-11-13T22:15:38.000Z",
"avg_as_string" : "2019-11-13T22:15:35.000Z",
"sum_as_string" : "2069-09-25T20:31:10.000Z"
}
},
{
"key" : "toto",
"doc_count" : 2,
"stats" : {
"count" : 2,
"min" : 1.57368333E12,
"max" : 1.573683337E12,
"avg" : 1.5736833335E12,
"sum" : 3.147366667E12,
"min_as_string" : "2019-11-13T22:15:30.000Z",
"max_as_string" : "2019-11-13T22:15:37.000Z",
"avg_as_string" : "2019-11-13T22:15:33.500Z",
"sum_as_string" : "2069-09-25T20:31:07.000Z"
}
},
{
"key" : "tutu",
"doc_count" : 2,
"stats" : {
"count" : 2,
"min" : 1.573683334E12,
"max" : 1.573683339E12,
"avg" : 1.5736833365E12,
"sum" : 3.147366673E12,
"min_as_string" : "2019-11-13T22:15:34.000Z",
"max_as_string" : "2019-11-13T22:15:39.000Z",
"avg_as_string" : "2019-11-13T22:15:36.500Z",
"sum_as_string" : "2069-09-25T20:31:13.000Z"
}
}
]
}
}
}
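As an aside: stats also computes avg and sum, and you can see above that sum_as_string renders as a meaningless date. If all you care about is the first and last event per algorithm, you could request just those with the dedicated min and max aggregations instead (the first_event/last_event names below are just labels I've chosen):

```json
POST algorithms/_search
{
  "size": 0,
  "aggs": {
    "algo": {
      "terms": { "field": "name", "size": 10 },
      "aggs": {
        "first_event": { "min": { "field": "time" } },
        "last_event": { "max": { "field": "time" } }
      }
    }
  }
}
```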
I do wonder how you will isolate the results from multiple runs. Maybe you have a unique process ID for each run, or maybe you have a time window you can rely on to contain all of the events from one run and none from any other run; either of those is a criterion you would add as a query to limit the documents used in the aggregations.
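For instance, if you went the time-window route, a range query on the time field would scope the aggregations to just that run. This is only a sketch - the window bounds here are placeholders you'd replace with your run's actual boundaries:

```json
POST algorithms/_search
{
  "size": 0,
  "query": {
    "range": {
      "time": {
        "gte": "2019-11-13T22:15:00Z",
        "lte": "2019-11-13T22:16:00Z"
      }
    }
  },
  "aggs": {
    "algo": {
      "terms": { "field": "name", "size": 10 },
      "aggs": {
        "stats": { "stats": { "field": "time" } }
      }
    }
  }
}
```

If you had a run ID on each document instead (say, a hypothetical run_id keyword field), a term query on that field would play the same role as the range query here.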