Aggregation: find the time window with the most documents

Given a time series of events, by day, I want to know which "hour:minute" has the most documents.

Example document: { "date" : "...", "event" : "search" }

I have tried to follow https://www.elastic.co/blog/implementing-a-statistical-anomaly-detector-part-1 but didn't manage to get back only the datetime.

Hey,

is this what you are after?

PUT foo/bar/_bulk
{ "index" : {} }
{ "date" : "2018-04-09T12:32" }
{ "index" : {} }
{ "date" : "2018-04-09T12:33" }
{ "index" : {} }
{ "date" : "2018-04-09T12:33" }
{ "index" : {} }
{ "date" : "2018-04-09T12:33" }
{ "index" : {} }
{ "date" : "2018-04-09T12:34" }
{ "index" : {} }
{ "date" : "2018-04-09T12:34" }

GET foo/bar/_search
{
  "size": 0, 
  "aggs": {
    "date": {
      "date_histogram": {
        "field": "date",
        "interval": "1m",
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

Hello, thank you.

This gives me ALL the "hour:minute" buckets. I want to get only the biggest one.

Or only the ones above a minimum count.

GET foo/bar/_search
{
  "size": 0,
  "aggregations": {
    "timeslice": {
        "histogram": {
            "script": "doc['date'].date.getHourOfDay()",
            "interval": 1,
            "min_doc_count": 1,
            "extended_bounds": {
                "min": 0,
                "max": 23
            },
            "order": {
                "_count": "desc"
            }
        },
        "aggregations": {
          "date": {
            "date_histogram": {
              "min_doc_count": 0,
              "field": "date",
              "interval": "1m",
              "order": {
                "_count": "desc"
              }
            }
          }
        }
    }
  }
}

This gives back the busiest hour first and then, in a sub-aggregation, its busiest minute first. You can set a limit as well to get back only the highest hour and minute.

I am not sure histogram supports a limit or size; that is the whole problem ^^

Ah OK, sorry, I misunderstood. As far as I know, histogram does not support a size either.
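One possible way around the missing size, sketched in Python as an assumption rather than something tried in this thread: keep the per-minute date_histogram and add a sibling max_bucket pipeline aggregation, which reports only the bucket (or tied buckets) with the highest count. The names busiest_minute, build_request and busiest_minute_from are made up for illustration, and the sample response is hand-written to mirror the response shape, not captured from a real cluster. The histogram buckets themselves are still returned alongside the result.

```python
def build_request(field="date", interval="1m"):
    """Request body: per-minute histogram plus a max_bucket sibling aggregation."""
    return {
        "size": 0,
        "aggs": {
            "date": {
                "date_histogram": {"field": field, "interval": interval}
            },
            # Sibling pipeline aggregation: selects the bucket(s) of "date"
            # with the highest document count.
            "busiest_minute": {
                "max_bucket": {"buckets_path": "date>_count"}
            },
        },
    }


def busiest_minute_from(response):
    """Return (keys, doc_count) of the fullest bucket(s) in a search response."""
    result = response["aggregations"]["busiest_minute"]
    return result["keys"], int(result["value"])


# Hand-written sample shaped like the response for the six documents above.
sample_response = {
    "aggregations": {
        "date": {"buckets": [
            {"key_as_string": "2018-04-09T12:32:00.000Z", "doc_count": 1},
            {"key_as_string": "2018-04-09T12:33:00.000Z", "doc_count": 3},
            {"key_as_string": "2018-04-09T12:34:00.000Z", "doc_count": 2},
        ]},
        "busiest_minute": {"value": 3.0, "keys": ["2018-04-09T12:33:00.000Z"]},
    }
}

keys, count = busiest_minute_from(sample_response)
print(keys, count)
```

The dictionary from build_request() would be sent as the body of the _search request; keys is a list because several minutes can tie for the maximum.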

How about this, with Painless? It is also fast here:

GET foo/bar/_search
{
  "size": 0,
  "aggregations": {
    "hour": {
        "terms": {
            "script": "doc['date'].date.getHourOfDay()",
            "min_doc_count": 1,
            "size": 1, 
            "order": {
                "_count": "desc"
            }
        },
        "aggregations": {
          "minute": {
            "terms": {
              "script": "doc['date'].date.getMinuteOfHour()",
              "min_doc_count": 1,
              "size": 1,
              "order": {
                "_count": "desc"
              }
            }
          }
        }
    }
  }
}

In ES 6.x:

GET foo/bar/_search
{
  "size": 0,
  "aggregations": {
    "hour": {
        "terms": {
            "script": "doc['date'].value.hourOfDay",
            "min_doc_count": 1,
            "size": 1, 
            "order": {
                "_count": "desc"
            }
        },
        "aggregations": {
          "minute": {
            "terms": {
              "script": "doc['date'].value.minuteOfHour",
              "min_doc_count": 1,
              "size": 1,
              "order": {
                "_count": "desc"
              }
            }
          }
        }
    }
  }
}
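To turn the result of the nested terms query into a single "hour:minute" string, a small client-side helper is enough. This is a minimal Python sketch: top_hour_minute is a hypothetical name, and the sample response is hand-written on the assumption that bucket keys may arrive either as strings or as numbers, which is why they go through int().

```python
def top_hour_minute(response):
    """Return ("HH:MM", doc_count) from the nested "hour" > "minute" terms buckets."""
    hour_buckets = response["aggregations"]["hour"]["buckets"]
    if not hour_buckets:
        return None  # no documents matched
    hour_bucket = hour_buckets[0]                    # "size": 1, so at most one bucket
    minute_bucket = hour_bucket["minute"]["buckets"][0]
    label = "{:02d}:{:02d}".format(int(hour_bucket["key"]), int(minute_bucket["key"]))
    return label, minute_bucket["doc_count"]


# Hand-written sample shaped like the response for the six documents above.
sample_response = {
    "aggregations": {
        "hour": {"buckets": [
            {"key": "12", "doc_count": 6,
             "minute": {"buckets": [{"key": "33", "doc_count": 3}]}}
        ]}
    }
}

print(top_hour_minute(sample_response))
```

Two caveats follow from how the query is nested: it returns the busiest minute inside the busiest hour, which is not necessarily the busiest minute overall, and because the scripts only look at hour and minute, documents from different days are counted together.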
