Filter out buckets in an aggregated query


(Claudio) #1

Hi, I have an aggregation query that gets the top_hits documents, filtered by terms on a multi-valued field, and I want to filter out the buckets that don't match the filtered values.

Let me explain the situation with an example:

Given these documents:

id    create_time    tags 
1     7/1/15         a
2     7/2/15         b,d
3     7/3/15         a,c
4     7/3/15         b
5     7/3/15         e

I want to get the latest document for each of some tags; for example, for "a" and "b" the result should be:

a -> 3  7/3/15  a,c
b -> 4  7/3/15  b

To do this I have the following query:

{
  "size":0,
  "query":{
    "filtered":{
      "query":{ "match_all":{} },
      "filter":{ "terms":{ "tags":[ "a", "b" ] }
      }
    }
  },
  "aggs":{
    "newest-event-query":{
      "terms":{ "field":"tags", "size":0 },
      "aggs":{
        "newest-event":{
          "top_hits":{ "size":1, "sort":[ { "create_time":{ "order":"desc" } } ] }
        }
      }
    }
  }
}

The problem is that this query returns buckets for the tags "c" and "d", because there are documents with tag "a" or "b" which have "c" or "d" as well. This is the result:

a -> 3  7/3/15  a,c
b -> 4  7/3/15  b
c -> 3  7/3/15  a,c
d -> 2  7/2/15  b,d

Is there any way to filter out the buckets "c" and "d", and just get buckets for the tags in the filter?

Thanks, Claudio


(Colin Goodheart-Smithe) #2

You could use the include parameter on the terms aggregation to only allow terms which are in your filter. See https://www.elastic.co/guide/en/elasticsearch/reference/1.6/search-aggregations-bucket-terms-aggregation.html#_filtering_values for more details.

HTH


(Claudio) #3

Thanks! The include parameter works perfectly.

For reference, the complete example is:

{
  "size":0,
  "query":{
    "filtered":{
      "query":{ "match_all":{} },
      "filter":{ "terms":{ "tags":[ "a", "b" ] }
      }
    }
  },
  "aggs":{
    "newest-event-query":{
      "terms":{ "field":"tags", "size":0, "include" : "a|b" },
      "aggs":{
        "newest-event":{
          "top_hits":{ "size":1, "sort":[ { "create_time":{ "order":"desc" } } ] }
        }
      }
    }
  }
}
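
A possible variation, in case the tag values ever contain characters that are special in a regular expression: as far as I can tell from the "Filtering Values" section linked above, include can also take an array of exact terms instead of a pattern. This is only a sketch under that assumption (I believe the array form needs Elasticsearch 1.5 or later, and I have not checked it against other versions); everything except the include value is unchanged from the query above.

```json
{
  "size": 0,
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": { "terms": { "tags": [ "a", "b" ] } }
    }
  },
  "aggs": {
    "newest-event-query": {
      "terms": { "field": "tags", "size": 0, "include": [ "a", "b" ] },
      "aggs": {
        "newest-event": {
          "top_hits": { "size": 1, "sort": [ { "create_time": { "order": "desc" } } ] }
        }
      }
    }
  }
}
```

With the array form, the same list of tags can be used for both the terms filter and the include parameter, so the two stay in sync without building a regex.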
