Elasticsearch Engineer I - aggregations problem/question

During exam preparation I ran across the following:

I wanted to see requests per country:

GET logs_server*/_search
{
  "size": 0,
  "aggs": {
    "top_countries": {
      "terms": {
        "field": "geoip.country_name.keyword",
        "size": 400
      }
    }
  }
}

That worked as expected.
Next I wanted to calculate the percentage per country, so I tried:

GET logs_server*/_search
{
  "size": 0,
  "aggs": {
    "top_countries": {
      "significant_terms": {
        "field": "geoip.country_name.keyword",
        "percentage": {},
        "size": 400
      }
    }
  }
}

This returns all the counts but does not calculate percentages. My next try:

GET logs_server*/_search
{
  "size": 0,
  "aggs": {
    "all_countries": {
      "value_count": {
        "field": "geoip.country_name.keyword"
      }
    },
    "by_countries": {
      "terms": {
        "field": "geoip.country_name.keyword",
        "size": 400
      },
      "aggs": {
        "percentage": {
          "bucket_script": {
            "buckets_path": {
              "all": "all_countries",
              "country": "by_countries"
            },
            "script": "country /all * 100"
          }
        }
      }
    }
  }
}

Produces:

"reason" : "No aggregation found for path [all_countries]"

So, how could I calculate the requests (doc_count) percentage per country?

Hey @dyjo

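I don't think this can be done with a pipeline aggregation in a single request. A buckets_path is resolved relative to where the pipeline aggregation sits, so a bucket_script nested inside by_countries cannot reach up to the sibling all_countries aggregation, which is why you get that error. You'd have to compute the percentages on the client side.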
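
For the client-side route, here is a minimal Python sketch that works on the response of the first terms query in this thread. The function name country_percentages and the sample response are made up for illustration, and the total is taken as the sum of the bucket counts plus sum_other_doc_count, which assumes each document carries at most one country value.

```python
def country_percentages(response, agg_name="top_countries"):
    """Return {country: percentage} from a terms aggregation response."""
    agg = response["aggregations"][agg_name]
    buckets = agg["buckets"]
    # Total = docs in the returned buckets plus docs in buckets cut off by "size".
    # Assumption: one country value per document, so counts do not overlap.
    total = sum(b["doc_count"] for b in buckets) + agg.get("sum_other_doc_count", 0)
    if total == 0:
        return {}
    return {b["key"]: b["doc_count"] * 100 / total for b in buckets}


# Hypothetical response, trimmed to the fields used above (not real data).
sample_response = {
    "aggregations": {
        "top_countries": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "United States", "doc_count": 600},
                {"key": "Germany", "doc_count": 300},
                {"key": "France", "doc_count": 100},
            ],
        }
    }
}

percentages = country_percentages(sample_response)
for country, pct in percentages.items():
    print(f"{country}: {pct:.1f}%")
```

Note that documents without a geoip.country_name value never land in a bucket, so these percentages are relative to documents that have a country, not to every document in the index.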

By the way, pipeline aggregations are not currently listed on the Elastic Certified Engineer exam objectives, so I would not worry about getting a question at this level on your exam.

If you don't want to calculate it on the client side, you could solve it with two requests. The main caveat is that documents can be indexed between the two calls, so the percentages may be slightly off.

# the result here is 1751476
GET logs_server*/_count

# add the above result to the params
GET logs_server*/_search
{
  "size": 0,
  "aggs": {
    "top_countries": {
      "terms": {
        "field": "geoip.country_name.keyword",
        "size": 400
      },
      "aggs": {
        "percentage": {
          "bucket_script": {
            "buckets_path": {
              "count": "_count"
            },
            "script": {
              "source": "(params.count / params.total) * 100",
              "params": {
                "total": 1751476
              }
            }
          }
        }
      }
    }
  }
}
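
If you script the two calls, the second request body can be generated from the first result. A rough Python sketch, assuming you already have the count in hand; build_percentage_request is a hypothetical helper, not part of any Elasticsearch client, and it only builds the JSON body shown above.

```python
import json


def build_percentage_request(total, field="geoip.country_name.keyword", size=400):
    """Build the search body for the second request, given the _count result."""
    if total <= 0:
        # Avoid a division by zero inside the bucket_script.
        raise ValueError("total must be a positive document count")
    return {
        "size": 0,
        "aggs": {
            "top_countries": {
                "terms": {"field": field, "size": size},
                "aggs": {
                    "percentage": {
                        "bucket_script": {
                            # _count is the doc_count of each country bucket.
                            "buckets_path": {"count": "_count"},
                            "script": {
                                "source": "(params.count / params.total) * 100",
                                "params": {"total": total},
                            },
                        }
                    }
                },
            }
        },
    }


# 1751476 is the count from the first request in this thread.
body = build_percentage_request(1751476)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET logs_server*/_search, either pasted into Console or passed to whichever client you use.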
