Sorting and bucket_selector not working on pipeline aggregation

Hi Team,

I have an index "order" that has order data with seller detail, status, and order value.

{
sId:1,
name: "John",
orderId:123
gov:100,
rtsStatus:"BRECHED"
},
{
sId:1,
name: "John",
orderId:456
gov:150,
rtsStatus:"NOT_BRECHED"
}
{
sId:2,
name: "Tia",
orderId:343,
gov:200,
rtsStatus:"BRECHED"
}

Now I want to retrieve data seller-wise and status with sorting on each field and should return data that has a particular status.

{
sId:1,
name: "John",
orderCount:2
breachedCount:1
orderValue:100
},
{
sId:2,
name: "Tia",
orderCount:1,
breachedCount:1,
orderValue:200
} 

The query I prepared is,

"aggs": {
  "orderCount": {
    "terms": {
      "field": "sId.keyword",
      "size": 10,
      "min_doc_count": 1,
      "shard_min_doc_count": 0,
      "show_term_doc_count_error": false,
      "order": [
        {
          "breachCount": "asc"
        },
        {
          "_key": "asc"
        }
      ]
    },
    "aggregations": {
      "Name": {
        "terms": {
          "field": "Name.keyword"
        }
      },
      "breachCount": {
        "filter": {
          "term": {
            "rtsStatus.keyword": {
              "value": "BREACHED",
              "boost": 1.0
            }
          }
        },
        "aggs": {
          "orderValue": {
            "sum": {
              "field": "gov"
            }
          }
        }
      },
      "breachPercent": {
        "bucket_script": {
          "buckets_path": {
            "orderCount": "_count",
            "breachCount": "breachCount>_count"
          },
          "script": {
            "source": "if(params.orderCount>0) {params.breachCount/params.orderCount*100} else {0.0}",
            "lang": "painless"
          },
          "gap_policy": "skip"
        }
      },
      "sales_bucket_filter": {
        "bucket_selector": {
          "buckets_path": {
            "totalSales": "breachCount>_count"
          },
          "script": "params.totalSales > 0"
        }
      }
    }
  }
}
}

Here the issue is,

  1. I m not able to sort buckets on the name field,
    -- error is, pipeline aggregation cannot be used to sort the buckets.
  2. I want to exclude the bucket that has breachCount 0.
    -- A final result I want should have a seller list that has at least 1 breached order. I didn't use query to filter on status BREACHED, coz I need to count total order and breached percentage.
    -- I have used bucket_selector but it is not working when sorting is breachCount ASC. This gives an empty bucket list thought data is exist. I think this is because after sorting first 10 buckets have breachCount 0 and it is selecting from the first 10 as I set "size": 10. But it should return the first 10 with brechCount >0.

The order of terms aggregation should be _key, _count, or other single value metrics sub-aggregation. You can not use other fields of documents. It is not obvious what the value representing the bucket from the field. You have to define how to aggregate the field to sort using the field.

It couldn't. That is the way how bucket selector aggregation works. Bucket selector aggregation just filter the returned buckets. You have to think about another way.

Hi @Tomo_M, thanks for your reply. Regarding sorting, If I use name instead of id in parent aggregation, then it will solve my problem But I can not do this as names can be duplicated for different sellers and it will lead to incorrect order data.

Can you please suggest any approach for name sub-aggregation that should return key-value(should return the name from the first bucket), not a bucket list as always there will be only one bucket and then I can use in order field?

If name is a keyword field, you can use top-metrics aggregation.

This is the perfect solution I was looking for but I m using elastic 7.6.0 where this top_metrics is not supported. Is there any alternative to achieve the same in the old version?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.