Proper buckets_path value for a Filters Aggregation?

Hi everyone!

I'm struggling with generating a query that fits the following scenario:

  • A single index with two types: purchase and search.
  • Given a multi-bucket metric (say, histogram), calculate for each bucket the ratio or quotient of the count of documents on both types.

So far, I've managed to put together something like this:

{
  "aggs": {
    "conversion_rate_m": {
      "filters": {
        "filters": {
          "abandon": {
            "bool": {
              "must": {
                "type": {
                  "value": "search"
                }
              }
            }
          },
          "conversion": {
            "bool": {
              "must": {
                "type": {
                  "value": "purchase"
                }
              }
            }
          }
        }
      },
      "aggs": {
        "conversion_abandon": {
          "cardinality": {
            "field": "_uid"
          }
        }
      }
    },
    "conversion_rate_m_metadata": {
      "bucket_script": {
        "buckets_path": {
          "convert": "conversion_rate_m> ?", // <-- No idea here
          "abandon": "conversion_rate_m> ?" // <-- ...nor here
        },
        "script": "convert / abandon"
      }
    }
  }
}

However, I failed to guess which are the actual values for buckets_path that would please Elasticsearch. Tried many possible combinations, up to a point where I was just guessing things up. To make matters worse, documentation is incredibly vague and obscure on the topic.

I also tried omitting the cardinality agg altogether and just make use of the doc_count value produced by each of the filters, but no avail. Stumbled upon the same problem: generating a proper bucket_path for the script to work.

So, which value should I use? Or maybe suggest a better way of going about this?

Thank you very much.

The issue here is that currently you cannot define a buckets_path that traverses through a multi bucket aggregation. Luckily for your use case you can instead use the single bucket aggregation filter rather than the multi bucket aggregation filters. Try the following (untested):

{
  "aggs": {
    "abandon": {
      "filter": {
        "bool": {
          "must": {
            "type": {
              "value": "search"
            }
          }
        }
      },
      "aggs": {
        "distinct": {
          "cardinality": {
            "field": "_uid"
          }
        }
      }
    },
    "conversion": {
      "filter": {
        "bool": {
          "must": {
            "type": {
              "value": "purchase"
            }
          }
        }
      },
      "aggs": {
        "distinct": {
          "cardinality": {
            "field": "_uid"
          }
        }
      }
    }
  },
  "conversion_rate_m_metadata": {
    "bucket_script": {
      "buckets_path": {
        "convert": "conversion>distinct",
        "abandon": "abandon>distinct"
      },
      "script": "convert / abandon"
    }
  }
}

Hope that helps

1 Like

Hey, thanks for your reply!

Yes, that was, in fact, my first approach. But lead me to a different issue. This is what I had, previously (renaming aggs after your suggestion):

 { "aggs": {
    "abandon": {
      "filter": {
        "bool": {
          "must": {
            "type": {
              "value": "search"
            }
          }
        }
      },
      "aggs": {
        "distinct": {
          "cardinality": {
            "field": "_uid"
          }
        }
      }
    },
    "conversion": {
      "filter": {
        "bool": {
          "must": {
            "type": {
              "value": "purchase"
            }
          }
        }
      },
      "aggs": {
        "distinct": {
          "cardinality": {
            "field": "_uid"
          }
        }
      }
    },
    "conversion_abandon_rate": {
      "bucket_script": {
        "buckets_path": {
          "convert": "conversion>distinct",
          "abandon": "abandon>distinct"
        },
        "script": "convert / Math.max(abandon, 1.0)"
      }
    }
  }

Running that produces a 503 response with:

{
  "response": {
    "error ": {
      "root_cause ": [],
      "type ": "reduce_search_phase_exception",
      "reason": "[reduce]",
      "phase": "fetch",
      "grouped": true,
      "caused_by": {
        "type": "class_cast_exception",
        "reason ": "org.elasticsearch.search.aggregations.bucket.filter.InternalFilter cannot be cast to org.elasticsearch.search.aggregations.InternalMultiBucketAggregation"
      }
    },
    "status": 503
  }
}

Which puzzled me because I'm not using any multi bucket aggregation in my query and can't see why the engine is trying to cast a filter to one.

Any thoughts?

1 Like