I have the following documents in Elasticsearch. Each document represents an order from a user of an e-commerce shop. Each document has the items that were bought (in an array) together with the user id.
I want to calculate the average number of items per order for each user, count how many users have an average of X items, and then plot the result as a bar plot or a histogram.
More details below:
{
"user_id": 1,
"bought_items" : ["Ball", "Pen"]
},
{
"user_id": 2,
"bought_items" : ["SomeItem1", "SomeItem2", "SomeItem3", "SomeItem4"]
},
{
"user_id": 1,
"bought_items" : ["Car", "Motorcycle", "Truck", "Yacht"]
},
{
"user_id": 3,
"bought_items": ["Item1", "Item2", "Item3"]
}
First, I want to get the average number of items per order for each user. Here:
user 1 => 3
user 2 => 4
user 3 => 3
Then I want to make a bar plot (or a histogram) where each bar represents how many users have bought X items per order on average. Here it would be:
2 users have bought 3 items on average.
1 user has bought 4 items on average.
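To make the expected computation concrete, here is a minimal Python sketch that does it client-side on the sample documents. It is not an Elasticsearch query, and the function names are hypothetical; it only pins down the result I am after. It also assumes the averages come out as whole numbers, as they do in this sample; how fractional averages should be bucketed is left open.

```python
from collections import Counter, defaultdict
from statistics import mean

# Sample orders, mirroring the documents in the question.
orders = [
    {"user_id": 1, "bought_items": ["Ball", "Pen"]},
    {"user_id": 2, "bought_items": ["SomeItem1", "SomeItem2", "SomeItem3", "SomeItem4"]},
    {"user_id": 1, "bought_items": ["Car", "Motorcycle", "Truck", "Yacht"]},
    {"user_id": 3, "bought_items": ["Item1", "Item2", "Item3"]},
]


def average_items_per_user(orders):
    """Step 1: map each user_id to the mean number of items per order."""
    sizes = defaultdict(list)
    for order in orders:
        sizes[order["user_id"]].append(len(order["bought_items"]))
    return {user: mean(counts) for user, counts in sizes.items()}


def users_per_average(averages):
    """Step 2: count how many users share each average."""
    counts = Counter(averages.values())
    return [
        {"item_quantity": quantity, "count": count}
        for quantity, count in sorted(counts.items())
    ]


averages = average_items_per_user(orders)
histogram = users_per_average(averages)
print(averages)
print(histogram)
```

The second step aggregates over the output of the first one, which is the part I do not know how to express in a single query.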
I imagine the result of the query would look something like this:
[{
"item_quantity": 3,
"count" : 2
},
{
"item_quantity": 4,
"count" : 1
}]
How could I accomplish that?