I have an index containing OrderLines in Elastic Search which contains customer name, product name, price.
My objective is to create a visualization (datalist) and aggregate by Customer Name, and find the totals for all customers that have bought product X, but have not bought product Y.
Hi, I don't think this will be possible with regular visualizations as you need features that aren't available via the UI (e.g. the bucket selector aggregation). Your best bet would be a raw elasticsearch queries, but there is currently no way to visualize the results of those as a table without writing a custom visualization plugin.
If you are also fine with a horizontal bar visualization, you can look into vega as it allows you to specify your custom query. In your case you can use a query similar to this one:
GET /your_index/_search
{
"size": 0,
"aggs": {
"customers": {
// aggregate by customer field
"terms": {
"field": "customer",
"size": 10
},
"aggs": {
// sum up all totals
"all_total": {
"sum": {
"field": "total"
}
},
// sum up just the totals of product x (by doing a filter aggregation first)
"x_total": {
"filter": {
"term": {
"product": "X"
}
},
"aggs": {
"sum": {
"sum": {
"field": "total"
}
}
}
},
// get count of Y product purchases for the current customer (to filter the bucket later)
"hasY": {
"filter": {
"term": {
"product": "Y"
}
},
"aggs": {
"count": {
"value_count": {
"field": "product"
}
}
}
},
// get count of X product purchases for the current customer (to filter the bucket later)
"hasX": {
"filter": {
"term": {
"product": "X"
}
},
"aggs": {
"count": {
"value_count": {
"field": "product"
}
}
}
},
// actually filter down the bucket by the script containing the condition
"bucket_filter": {
"bucket_selector": {
"buckets_path": {
"xCount": "hasX.count",
"yCount": "hasY.count"
},
"script": "params.xCount > 0 && params.yCount == 0"
}
}
}
}
}
}
It makes sense to first build the query in the dev tools and when it produces the right results you can plug it into a vega spec. Feel free to ask questions if you get stuck on this route.
In the meantime I found another way to get a bit closer :
In Kibana you can create a datatable and aggregate by CustomerName, and then create multiple metrics. For each metric (of type SUM) you can specify in the advanced settings a filter, similar to :
It's great you found a way that works for your use case - I actually typed out this solution half-way and then deleted it because I missed the part with filtering out the rows that don't have Product Y. Instead of advanced JSON input you could also use scripted fields which are defined on an index pattern level.
I still wonder if there is a way to hide a row completely when there is no sale for Product Y.
This is what the bucket_selector is doing on Elasticsearch side - unfortunately there is currently no way I know of to do it on the Kibana side that's supported by the standard table visualization.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.