Why does a composite aggregation return empty buckets first?

Hi,

I have an index with 9 million documents. I ran a composite aggregation on a nested field and added a bucket_sort on doc_count (descending). With this setup, many of the buckets I expect are missing from the result.
Here is my aggregation query, with size 2000:

"aggs": {
  "location": {
      "nested": {
          "path": "resume.profile.locations"
      },
      "aggs": {
          "location_s": {
              "composite": {
                  "size": 2000,
                  "sources": [
                      {
                          "state": {
                              "terms": {
                                  "field": "resume.profile.locations.stateCanonical.keyword"
                              }
                          }
                      },
                      {
                          "state_code": {
                              "terms": {
                                  "field": "resume.profile.locations.stateCode.keyword"
                              }
                          }
                      },
                      {
                          "city": {
                              "terms": {
                                  "field": "resume.profile.locations.cityCanonical.keyword"
                              }
                          }
                      },
                      {
                          "postal_code": {
                              "terms": {
                                  "field": "resume.profile.locations.postalCode.keyword"
                              }
                          }
                      },
                      {
                          "country": {
                              "terms": {
                                  "field": "resume.profile.locations.countryCanonical.keyword"
                              }
                          }
                      },
                      {
                          "address_type": {
                              "terms": {
                                  "field": "resume.profile.locations.addressType.keyword"
                              }
                          }
                      },
                      {
                          "confidence_score": {
                              "terms": {
                                  "field": "resume.profile.locations.confidenceScore"
                              }
                          }
                      }
                  ]
              },
              "aggs": {
                  "doc_count_sort": {
                      "bucket_sort": {
                          "sort": [
                              {
                                  "_count": "desc"
                              }
                          ]
                      }
                  }
              }
          }
      }
  }
}

The first buckets I get back have empty values, like this:

{
  "key": {
    "state": "",
    "state_code": "",
    "city": "",
    "postal_code": "00729",
    "country": "",
    "address_type": "present",
    "confidence_score": 1
  },
  "doc_count": 7
},
{
  "key": {
    "state": "",
    "state_code": "",
    "city": "",
    "postal_code": "00841",
    "country": "",
    "address_type": "present",
    "confidence_score": 1
  },
  "doc_count": 7
},
{
  "key": {
    "state": "",
    "state_code": "",
    "city": "",
    "postal_code": "00962",
    "country": "",
    "address_type": "present",
    "confidence_score": 1
  },
  "doc_count": 7
}

Those postal codes actually have a city and a state in the data, but Elasticsearch is not showing them.

When I increased the size from 2000 to 20000,
I got buckets like this:

{
  "key": {
    "state": "alabama",
    "state_code": "AL",
    "city": "huntsville",
    "postal_code": "35810",
    "country": "united states of america",
    "address_type": "Present",
    "confidence_score": 1
  },
  "doc_count": 3107
},
{
  "key": {
    "state": "alabama",
    "state_code": "AL",
    "city": "montgomery",
    "postal_code": "36116",
    "country": "united states of america",
    "address_type": "Present",
    "confidence_score": 1
  },
  "doc_count": 2728
},
{
  "key": {
    "state": "alabama",
    "state_code": "AL",
    "city": "montgomery",
    "postal_code": "36117",
    "country": "united states of america",
    "address_type": "Present",
    "confidence_score": 1
  },
  "doc_count": 2101
}
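
One likely reading of the difference between the two result sets, sketched in Python with made-up data: a composite aggregation returns its buckets in the natural order of the source keys, and a bucket_sort nested under it reorders only the buckets on the current page. An empty string sorts before any other state value, so a small page can consist entirely of empty-state keys. The function name `composite_page` and the sample buckets are hypothetical; this is an in-memory illustration, not Elasticsearch code.

```python
from operator import itemgetter

# Hypothetical (state, postal_code) buckets; the values echo the post.
ALL_BUCKETS = [
    {"key": ("alabama", "35810"), "doc_count": 3107},
    {"key": ("", "00729"), "doc_count": 7},
    {"key": ("alabama", "36116"), "doc_count": 2728},
    {"key": ("", "00841"), "doc_count": 7},
    {"key": ("", "00962"), "doc_count": 7},
]


def composite_page(buckets, size):
    """Mimic one composite page followed by a per-page count sort."""
    # Step 1: composite orders buckets by key, so "" comes before "alabama".
    in_key_order = sorted(buckets, key=itemgetter("key"))
    # Step 2: only the first `size` keys make it onto the page.
    page = in_key_order[:size]
    # Step 3: the count sort sees this page only, never the remaining keys.
    return sorted(page, key=itemgetter("doc_count"), reverse=True)


small_page = composite_page(ALL_BUCKETS, 3)
large_page = composite_page(ALL_BUCKETS, 5)
```

With the small size, `small_page` holds only the three empty-state buckets. With the larger size, the high-count alabama buckets fit on the page and move to the top.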

Can anybody tell me what the issue is here? How can I get the complete buckets with a smaller size, without changing the search query? I need to run the aggregation over the whole data set.
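
One common way to cover the whole data set with a small size is to page through the composite aggregation, passing each response's after_key back as the after parameter, and to rank by doc_count on the client. Below is a minimal Python sketch under stated assumptions: `fetch_page` is a hypothetical stand-in for the real search request, faked in memory here, and `top_buckets`, `make_fake_fetch` and the sample data are illustrative names, not part of any client library.

```python
import heapq


def top_buckets(fetch_page, n):
    """Walk every composite page and keep the n buckets with the highest doc_count."""
    best = []  # min-heap of (doc_count, sequence, bucket)
    after = None
    seq = 0  # unique tie-breaker so bucket dicts are never compared
    while True:
        page = fetch_page(after)
        for bucket in page["buckets"]:
            item = (bucket["doc_count"], seq, bucket)
            seq += 1
            if len(best) < n:
                heapq.heappush(best, item)
            else:
                heapq.heappushpop(best, item)
        after = page.get("after_key")
        if after is None:  # no after_key: nothing left to page through
            break
    return [bucket for _, _, bucket in sorted(best, reverse=True)]


def make_fake_fetch(buckets, size):
    """Hypothetical in-memory replacement for the search call (assumption)."""
    ordered = sorted(buckets, key=lambda b: b["key"]["postal_code"])

    def fetch_page(after):
        start = 0
        if after is not None:
            start = next(i for i, b in enumerate(ordered) if b["key"] == after) + 1
        chunk = ordered[start:start + size]
        page = {"buckets": chunk}
        if chunk:
            page["after_key"] = chunk[-1]["key"]
        return page

    return fetch_page


SAMPLE = [
    {"key": {"postal_code": "00729"}, "doc_count": 7},
    {"key": {"postal_code": "00841"}, "doc_count": 7},
    {"key": {"postal_code": "35810"}, "doc_count": 3107},
    {"key": {"postal_code": "36116"}, "doc_count": 2728},
    {"key": {"postal_code": "36117"}, "doc_count": 2101},
]

top = top_buckets(make_fake_fetch(SAMPLE, size=2), n=2)
```

The heap keeps memory bounded by n rather than by the total number of buckets, which matters when the key space is large. In a real setup, `fetch_page` would send the same aggregation with the after value added to the composite block.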

Thank you,
