Bucket selector in sub aggregation or cardinality aggregation

Hi guys

I have this query

GET /my_index3/_search
{
  "size": 0,
  "aggs": {
    "num1": {
      "terms": {
        "field": "num1.keyword",
        "order" : { "_count" : "desc" }
      },
      "aggs": {
        "count_of_distinct_suffix": {
          "cardinality" :{
             "field" : "suffix.keyword"
          }
        }
      }
    }
  } 
}

It produces this output:

    {
      "key" : "1563866656878888",
      "doc_count" : 42,
      "count_of_distinct_suffix" : {
        "value" : 2
      }
    },
    {
      "key" : "1563866656871111",
      "doc_count" : 40,
      "count_of_distinct_suffix" : {
        "value" : 2
      }
    },
    {
      "key" : "1563867854325555",
      "doc_count" : 36,
      "count_of_distinct_suffix" : {
        "value" : 1
      }
    },
    {
      "key" : "1563867854323333",
      "doc_count" : 12,
      "count_of_distinct_suffix" : {
        "value" : 1
      }
    },

I want to see only the buckets that have "count_of_distinct_suffix" : { "value" : 2 }

I was thinking of using a bucket_selector aggregation, but I can't add it inside the cardinality aggregation:

      "aggs": {
        "my_filter": {
          "bucket_selector": {
            "buckets_path": {
              "the_doc_count": "_count"
            },
            "script": "params.doc_count == 2"
          }
        }
      }

It gives me the following error: Aggregator [count_of_distinct_suffix] of type [cardinality] cannot accept sub-aggregations

Do you have any idea how to solve this?

Thank you very much in advance for any help!

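It looks like you nested the bucket_selector aggregation inside the cardinality aggregation. cardinality is a metric aggregation, so it cannot hold sub-aggregations. Nest the bucket_selector inside the terms aggregation instead, as a sibling of the cardinality aggregation, and point its buckets_path at the cardinality result. The following works: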

GET /my_index3/_search
{
  "size": 0,
  "aggs": {
    "num1": {
      "terms": {
        "field": "num1.keyword",
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "count_of_distinct_suffix": {
          "cardinality": {
            "field": "suffix.keyword"
          }
        },
        "my_filter": {
          "bucket_selector": {
            "buckets_path": {
              "count_of_distinct_suffix": "count_of_distinct_suffix"
            },
            "script": "params.count_of_distinct_suffix == 2"
          }
        }
      }
    }
  }
}

(By the way, screenshots of requests and responses are not a good format for sharing code snippets on this forum. Please share those as plain text, formatted with the </> button. It makes it much easier for folks to help you :slightly_smiling_face: )

Thank you very much for your help!

May I ask you another question please? :smiley:

How would you show the same result, but only if the two distinct suffixes occur within 72 hours?

  "range": {
    "within72hours": {
      "gte": "now-72h"
    }
  }

Yes, a range query would do the trick.

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
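
As a minimal sketch, the range query could be combined with the aggregations from the earlier request roughly like this. The date field name @timestamp is an assumption, since the mapping isn't shown in this thread; substitute whichever date field the documents in my_index3 actually carry.

```
# Assumption: the documents have a date field named "@timestamp"
GET /my_index3/_search
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-72h"
      }
    }
  },
  "aggs": {
    "num1": {
      "terms": {
        "field": "num1.keyword",
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "count_of_distinct_suffix": {
          "cardinality": {
            "field": "suffix.keyword"
          }
        },
        "my_filter": {
          "bucket_selector": {
            "buckets_path": {
              "count_of_distinct_suffix": "count_of_distinct_suffix"
            },
            "script": "params.count_of_distinct_suffix == 2"
          }
        }
      }
    }
  }
}
```

The query runs first, so the terms, cardinality and bucket_selector aggregations only see documents from the last 72 hours.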

Thank you for your reply!

I mean the case where a num1 has only one suffix at first: if the same num1 doesn't get a second suffix within some time, e.g. one hour, the bucket shouldn't be shown, even if count_of_distinct_suffix == 2.

I think the range query just shows me the buckets within the defined range, or am I mistaken?

The query limits the scope of the aggregations: only documents from the last 72 hours are aggregated. A bucket is therefore returned only if count_of_distinct_suffix == 2 among the documents of the last 72 hours.
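
If the requirement is instead that the suffixes of a num1 arrive within a fixed window of each other (for example one hour), one possible approach is to compute the earliest and latest timestamp per bucket and compare them in the same bucket_selector. This is only a sketch under assumptions: @timestamp is an assumed date field name, first_seen and last_seen are illustrative aggregation names, and the check covers the span between the first and the last document of the whole bucket, which is stricter than the span between the first occurrences of the two suffixes.

```
# Assumption: the documents have a date field named "@timestamp"
# first_seen / last_seen are illustrative names, not from the original thread
GET /my_index3/_search
{
  "size": 0,
  "aggs": {
    "num1": {
      "terms": {
        "field": "num1.keyword"
      },
      "aggs": {
        "count_of_distinct_suffix": {
          "cardinality": {
            "field": "suffix.keyword"
          }
        },
        "first_seen": {
          "min": {
            "field": "@timestamp"
          }
        },
        "last_seen": {
          "max": {
            "field": "@timestamp"
          }
        },
        "my_filter": {
          "bucket_selector": {
            "buckets_path": {
              "distinct": "count_of_distinct_suffix",
              "first": "first_seen",
              "last": "last_seen"
            },
            "script": "params.distinct == 2 && (params.last - params.first) <= 3600000"
          }
        }
      }
    }
  }
}
```

min and max on a date field return epoch milliseconds, so 3600000 corresponds to one hour. Adjust that constant for a different window.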