Terms Aggregation, but over a time period?

Hi everyone! I am trying to create a query that searches and identifies when a specific term reaches a threshold over a given period of time. My first thought is to construct a search that will, for example, bring back the total number of ice cream cones sold over 3 hours, aggregated by flavor. What would be the best route for this? I can see a terms aggregation could help:

GET /_search
{
    "aggs" : {
        "genres" : {
            "terms" : { "field" : "genre" } 
        }
    }
}

But is it possible to include the time in the search? I have access to the @timestamp field, so would it be possible to combine a basic must query (where i can reference @timestamp) and the term aggregation? My ideal output would be something like this - but over a 3 hour period:

...
"aggregations" : {
"cone_flavors" : {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets" : [
{
"key" : "vanilla",
"doc_count" : 100
},
{
"key" : "chocolate",
"doc_count" : 83
},
{
"key" : "twist",
"doc_count" : 43
}
]
}
}
}

Thank you all for your help, and apologies if the answer is somewhere in the docs but I overlooked :cold_face:

yes, you can combine a query and and aggregation like this

{
  "query" : { "range" : { "@timestamp" : { "gte" : "now-3d", "lt" :  "now" } } },
  "aggs" : {
        "genres" : {
            "terms" : { "field" : "genre" } 
        }
    }
}

This will only aggregate on the documents that match the query.

1 Like

Hi Spinscale, thanks for your response. I'm still having trouble getting the response to return within the time limit.

I am getting back everything from X index from the last 3 days (using the example), and then get back the buckets containing all the genres with their total counts. However, if I would change the @timestamp to be 1 minute, the buckets still contain the same count for the genre aggregation.
These two queries return the same results, almost as if the queries were operating separately:

GET /blah-blah*/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "from": "now-1M",
        "to": "now"
      }
    }
  },
  "aggs": {
    "genres": {
      "terms": {
        "field": "genre"
      }
    }
  }
}

 GET /blah-blah*/_search
    {
      "query": {
        "range": {
          "@timestamp": {
            "from": "now-3d",
            "to": "now"
          }
        }
      },
      "aggs": {
        "genres": {
          "terms": {
            "field": "genre"
          }
        }
      }
    }

Results:
....
"aggregations" : {
"genres" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "Rock",
"doc_count" : 30746
},
{
"key" : "Bluegrass",
"doc_count" : 20477
},
{
"key" : "Jazz",
"doc_count" : 3221
},
{
"key" : "Country",
"doc_count" : 1918
},
...

hey, can you share the full response of both queries?

Hi Spinscale, I am so sorry for the delayed response. So, as it turns out, your query totally worked - I was searching over the past month of data (1M), not minute in my above query. And, since our Elasticsearch cluster was so new (like, 2 days old), it returned the same amount of data. Thank you for your help!

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.