Filter by minimum value per parent (or term)


(Manuel Brunner) #1

Use Case

  • Hotels (~some thousand) with Offers (~some million)
  • Hotels and offers each have a lot of data where filters are applied
  • Sorting needs to be done by minimum offer price or some other ratings
  • Need to retrieve the cheapest offer per hotel (while matching all
    hotel and offer filters)

I have modeled it so that hotels are parents of offers.

Approaches:

  1. Use a has_child filter to get all hotels which have at least one
    matching offer. I think I could even get the minimum price with a
    top_children query and custom_score, but I still don't know which offer was
    the cheapest.
  2. Use a has_parent filter to get all offers, which match offer and
    hotel filters.

With approach 2 I could filter the cheapest offer in the application, but I
would have to retrieve all matching offers from elasticsearch and would
lose all pagination and sorting capabilities and lose quite some
performance, if there are a lot of matches.

Here is a simplified example. Offers 1 and 2 are returned, but I need to
return only the cheapest one per hotel (offer 2). Is there any possibility
to achieve this?

#!/bin/sh

curl -XPUT 'http://localhost:9200/test' -d '{
"mappings": {
"offer": {
"_parent": {
"type": "hotel"
}
},
"hotel": {
"properties": {
"country": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}'

echo

curl -XPUT 'http://localhost:9200/test/hotel/1' -d '{
"country" : "AT"
}'

echo

curl -XPUT 'http://localhost:9200/test/hotel/2' -d '{
"country" : "DE"
}'

echo

curl -XPUT 'http://localhost:9200/test/offer/1?parent=1' -d '{
"arrival": "2015-01-01",
"price": "100"
}'

echo

curl -XPUT 'http://localhost:9200/test/offer/2?parent=1' -d '{
"arrival" : "2015-01-01",
"price" : "90"
}'

echo

curl -XPUT 'http://localhost:9200/test/offer/3?parent=1' -d '{
"arrival" : "2015-01-02",
"price" : "80"
}'

echo

curl -XPUT 'http://localhost:9200/test/offer/4?parent=2' -d '{
"arrival": "2015-01-01",
"price": "100"
}'

echo
sleep 1

curl -XGET 'http://localhost:9200/test/offer/_search' -d '{
"query": {
"filtered": {
"filter": {
"bool": {
"must": {
"term": {
"arrival": "2015-01-01"
}
},
"must": {
"has_parent": {
"type": "hotel",
"filter": {
"term" : {"country" : "AT" }

                    }
                  }
              }
            }
        }
    }
}

}'
echo

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c2ecb2c8-70db-452d-aa08-89d0a6c2ae3c%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Manuel Brunner) #2

With aggregations in the new elasticsearch version I came up with a
solution to get the cheapest offer per hotel:

However, it returns just the bucket key and the value:
"buckets" : [ {
"key" : "h1",
"doc_count" : 2,
"min_price" : {
"value" : 90.0
}
}

I cannot see which of the offers has this minimum price.

Does anyone have a solution for this problem? I'm not sure if Elasticsearch
is really a good fit for our use case and I might want to have a look at
MongoDB.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0b142006-171a-4348-9f85-315afd7b9e65%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(system) #3