Mean aggregation over a cardinality aggregation, over a previous composite aggregation

My use case requires an average over a previous cardinality aggregation. That aggregation runs over the buckets generated by a composite aggregation with a calendar date histogram source, which breaks the timeline into calendar days because I'm working over a long time span.

I don't know which Elasticsearch capability fits this case best: sub-aggregation, pipeline aggregation, ... The cardinality column is generated over the calendar buckets using a sub-aggregation, and I would like to work over that output to obtain an average.

aggregations: {
    "daily_product_names_for_specific_shop": {
        "composite": {
            "size": 10000,
            "sources": [
              {
                "date": {
                  "date_histogram": {
                    "field": "@timestamp",
                    "calendar_interval": "1d",
                    "format": "iso8601"
                  }
                }
              }
            ]
        },
        # Sub-aggregation over composite buckets
        "aggregations": {
           "different_products_amount": {
               "cardinality": {
                    "field": "ticket.p_name"
                }
           }
        # Maybe some stuff
        }
    }
}

In my opinion, the average bucket (avg_bucket) pipeline aggregation fits your purpose. If it doesn't work, please ask again.

I have tried:

aggregations: {
    "daily_product_names_for_specific_shop": {
        "composite": {
            "size": 10000,
            "sources": [
              {
                "date": {
                  "date_histogram": {
                    "field": "@timestamp",
                    "calendar_interval": "1d",
                    "format": "iso8601"
                  }
                }
              }
            ]
        },
        # Sub-aggregation over composite buckets
        "aggregations": {
           "different_products_amount": {
               "cardinality": {
                    "field": "ticket.p_name"
                }
           },
           # The avg
           "mean": {
                "avg_bucket": {
                    "buckets_path": "different_products_amount",
                    "gap_policy": "skip",
                    "format": "#,##0.00;(#,##0.00)"
                }
           }
        }
    }
}

But I get:

Validation Failed: 1: The first aggregation in buckets_path must be a multi-bucket aggregation for aggregation [mean] found :org.elasticsearch.search.aggregations.metrics.CardinalityAggregationBuilder for buckets path: different_products_amount;

I have also tried:

aggregations: {
    "daily_product_names_for_specific_shop": {
        "composite": {
            "size": 10000,
            "sources": [
              {
                "date": {
                  "date_histogram": {
                    "field": "@timestamp",
                    "calendar_interval": "1d",
                    "format": "iso8601"
                  }
                }
              }
            ]
        },
        # Sub-aggregation over composite buckets
        "aggregations": {
           "different_products_amount": {
               "cardinality": {
                    "field": "ticket.p_name"
                }
           }
        }
    },
    "mean": {
        "avg_bucket": {
            "buckets_path": "daily_product_names_for_specific_shop>different_products_amount",
            "gap_policy": "skip",
            "format": "#,##0.00;(#,##0.00)"
        }
    }
}

But then the composite aggregation itself is rejected:

Validation Failed: 1: The first aggregation in buckets_path must be a multi-bucket aggregation for aggregation [mean] found :org.elasticsearch.search.aggregations.bucket.composite.CompositeAggregationBuilder for buckets path: daily_product_names_for_specific_shop>different_products_amount;

The latter query looks syntactically correct. Sorry, I didn't know that pipeline aggregations cannot be used with a composite aggregation.

I found an issue explaining the reason.
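Since avg_bucket cannot read composite buckets, one possible workaround is to page through the composite aggregation and compute the mean on the client. The following is only a sketch under assumptions: `search` is a hypothetical callable that takes a request body and returns the parsed `_search` response as a dict (for example, a thin wrapper around whatever client you use), and the helper names `composite_body`, `iter_buckets` and `mean_cardinality` are made up for illustration. The aggregation and field names are the ones from the queries above.

```python
from typing import Any, Callable, Dict, Iterator, Optional

AGG_NAME = "daily_product_names_for_specific_shop"
METRIC_NAME = "different_products_amount"

# Hypothetical type: any function that sends a body to _search and returns the response dict.
Search = Callable[[Dict[str, Any]], Dict[str, Any]]


def composite_body(after_key: Optional[Dict[str, Any]] = None, size: int = 1000) -> Dict[str, Any]:
    """Build one page of the composite request, resuming after `after_key` if given."""
    composite: Dict[str, Any] = {
        "size": size,
        "sources": [
            {"date": {"date_histogram": {
                "field": "@timestamp",
                "calendar_interval": "1d",
                "format": "iso8601",
            }}}
        ],
    }
    if after_key is not None:
        composite["after"] = after_key
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggregations": {
            AGG_NAME: {
                "composite": composite,
                "aggregations": {
                    METRIC_NAME: {"cardinality": {"field": "ticket.p_name"}}
                },
            }
        },
    }


def iter_buckets(search: Search) -> Iterator[Dict[str, Any]]:
    """Yield every daily bucket, following after_key until a page comes back empty."""
    after_key = None
    while True:
        agg = search(composite_body(after_key))["aggregations"][AGG_NAME]
        if not agg["buckets"]:
            return
        yield from agg["buckets"]
        after_key = agg.get("after_key")
        if after_key is None:
            return


def mean_cardinality(search: Search) -> Optional[float]:
    """Average the per-day cardinality values; None if there are no buckets."""
    total = 0
    count = 0
    for bucket in iter_buckets(search):
        total += bucket[METRIC_NAME]["value"]
        count += 1
    return total / count if count else None
```

Composite only returns days that contain documents, so this averages over non-empty days, which is roughly what `"gap_policy": "skip"` was meant to achieve.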

Would there be any serious problem with using only a date histogram aggregation, without the composite aggregation, and discarding the output you don't need?
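To make the suggestion concrete, here is a rough sketch of what that request could look like, written as a Python dict so it can be dumped to JSON. It reuses the names and fields from the queries above; the only change is that the composite aggregation is replaced by a plain date_histogram, which is a multi-bucket aggregation that avg_bucket accepts in its buckets_path. The function name `daily_mean_body` is made up for illustration, and I have not run this against the poster's index.

```python
import json
from typing import Any, Dict


def daily_mean_body() -> Dict[str, Any]:
    """Request body: daily date_histogram, cardinality per day, avg_bucket as a sibling."""
    return {
        "size": 0,  # skip the hits, only aggregations are of interest
        "aggregations": {
            "daily_product_names_for_specific_shop": {
                "date_histogram": {
                    "field": "@timestamp",
                    "calendar_interval": "1d",
                    "format": "iso8601",
                },
                "aggregations": {
                    "different_products_amount": {
                        "cardinality": {"field": "ticket.p_name"}
                    }
                },
            },
            # Sibling pipeline aggregation: averages the per-day cardinality values
            "mean": {
                "avg_bucket": {
                    "buckets_path": "daily_product_names_for_specific_shop>different_products_amount",
                    "gap_policy": "skip",
                    "format": "#,##0.00;(#,##0.00)",
                }
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(daily_mean_body(), indent=2))
```

Two things to keep in mind with this approach: the per-day buckets are still computed and returned unless the response is filtered (for example with the `filter_path` request parameter), and a very long time span may run into the cluster's `search.max_buckets` limit.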
