Advanced filtering for aggregations

Allard_Poldermans · December 30, 2019, 8:13am

I have an index containing OrderLines in Elastic Search which contains customer name, product name, price.

My objective is to create a visualization (datalist) and aggregate by Customer Name, and find the totals for all customers that have bought product X, but have not bought product Y.

The final display should be something like :

Customer Name Total Value Product A value

Cust1 1000 (all bought products) 100 (only product X)
.....

Is this possible to do in Kibana ?

flash1293 · December 30, 2019, 10:19am

Hi, I don't think this will be possible with regular visualizations as you need features that aren't available via the UI (e.g. the bucket selector aggregation). Your best bet would be a raw elasticsearch queries, but there is currently no way to visualize the results of those as a table without writing a custom visualization plugin.

If you are also fine with a horizontal bar visualization, you can look into vega as it allows you to specify your custom query. In your case you can use a query similar to this one:

GET /your_index/_search
{
  "size": 0,
  "aggs": {
    "customers": {
      // aggregate by customer field
      "terms": {
        "field": "customer",
        "size": 10
      },
      "aggs": {
      // sum up all totals  
        "all_total": {
          "sum": {
            "field": "total"
          }
        },
      // sum up just the totals of product x (by doing a filter aggregation first)
        "x_total": {
          "filter": {
            "term": {
              "product": "X"
            }
          },
          "aggs": {
            "sum": {
              "sum": {
                "field": "total"
              }
            }
          }
        },
       // get count of Y product purchases for the current customer (to filter the bucket later)
        "hasY": {
          "filter": {
            "term": {
              "product": "Y"
            }
          },
          "aggs": {
            "count": {
              "value_count": {
                "field": "product"
              }
            }
          }
        },
       // get count of X product purchases for the current customer (to filter the bucket later)
        "hasX": {
          "filter": {
            "term": {
              "product": "X"
            }
          },
          "aggs": {
            "count": {
              "value_count": {
                "field": "product"
              }
            }
          }
        },
       // actually filter down the bucket by the script containing the condition
        "bucket_filter": {
          "bucket_selector": {
            "buckets_path": {
              "xCount": "hasX.count",
              "yCount": "hasY.count"
            },
            "script": "params.xCount > 0 && params.yCount == 0"
          }
        }
      }
    }
  }
}

It makes sense to first build the query in the dev tools and when it produces the right results you can plug it into a vega spec. Feel free to ask questions if you get stuck on this route.

Allard_Poldermans · December 30, 2019, 3:44pm

Thank you very much for your reply and effort.

In the meantime I found another way to get a bit closer :

In Kibana you can create a datatable and aggregate by CustomerName, and then create multiple metrics. For each metric (of type SUM) you can specify in the advanced settings a filter, similar to :

{ "script" : "if (doc['Product.Name.keyword'].value =='Product X' ) { return _value; } return 0" }

I also do the same for Product Y.

This will give me the dataset with a column for the total sale, the sale for Product X and the sale for Product Y (which sometimes is 0).

I can then export the results to Excel and filter out the rows where the sale for Product Y is 0.

I still wonder if there is a way to hide a row completely when there is no sale for Product Y.

flash1293 · December 30, 2019, 4:11pm

It's great you found a way that works for your use case - I actually typed out this solution half-way and then deleted it because I missed the part with filtering out the rows that don't have Product Y. Instead of advanced JSON input you could also use scripted fields which are defined on an index pattern level.

I still wonder if there is a way to hide a row completely when there is no sale for Product Y.

This is what the bucket_selector is doing on Elasticsearch side - unfortunately there is currently no way I know of to do it on the Kibana side that's supported by the standard table visualization.

system · January 27, 2020, 4:11pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Create a table counting the number of buckets having same count (count pipeline ?) Kibana	4	256	November 23, 2020
How to filter visualization based on result of aggregate? Kibana	3	3594	May 2, 2018
Visualize data without filters Kibana	4	202	August 1, 2023
Filtering on aggregations Elasticsearch	2	88	July 24, 2024
Using Filters in Visualizations Kibana	3	348	December 6, 2018

Advanced filtering for aggregations

Customer Name Total Value Product A value

Related topics