Imagine the following use case:
We work at Stark Airlines and our marketing team wants to segment our passengers in order to give them discounts or gift cards. They decide that they want two sets of passengers:
- Passengers that fly at least 3 times per week
- Passenger who have flown at least once but who have not flown for two weeks
With this they can make different marketing campaigns for our passengers!
So, in elastic search we have a trip
index that represents a ticket bought by a passenger:
{
"_index" : "trip",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"total_amount" : 300,
"trip_date" : "2020/03/24 13:30:00",
"status" : "completed",
"passenger" : {
"id" : 11,
"name" : "Thiago nunes"
}
}
}
The trip
index contains a status
field that may have other values like: pending
or open
or canceled
This means that we can only take into account trips that has the completed
status (Meaning the passenger did travel).
So, with all this in mind...How would I get those two sets of passengers with elastic search?
I have been trying for a while but with no success.
What I have done until now:
- I have built a query that gets all valid trip (trips with status
completed
)
GET /trip/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"status": {
"value": "completed"
}
}
}
]
}
},
"aggs": {
"status_viagem": {
"terms": {
"field": "status.keyword"
}
}
}
}
- This query returns the following:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 200,
"relation" : "eq"
},
"max_score" : 0.18232156,
"hits" : [...]
},
"aggregations" : {
"status_viagem" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "completed",
"doc_count" : 200
}
]
}
}
}
But I am stuck and can't figure out the next step. I know that the next thing to do should create buckets with passengers and then filter them in two buckets representing our desired data sets. But I don't know how.
Can someone help?
PS.:
-
I don't exactly need this to be one single query, just a hint about how to build a query like this would be very helpful
-
THE OUPUT SHOULD BE AN ARRAY of passenger id's
-
Note: I have shortened the
trip
index for the sake of simplicity