Multi level nested docs query filters and facets


(Rauan Maemirov) #1

Hey, all. I'm having problems with multi-level nested docs. Earlier I
tried to use two sibling nested objects in one doc, but that concept was wrong.
Here I've got another schema, much simplified for better understanding. I'm
asking for help with a proof of concept for multi-level nested docs
(they are mentioned in the docs, but I've found no examples at all for this
kind of use case).

Let's assume the well-known schema mapping with authors, their books and the
stores where the books can be bought:

"authors": {
    "properties": {
        "name": {"type": "string"},
        "books": {
            "type": "nested", "index": "not_analyzed",
            "properties": {
                "title": {"type": "string"},
                "published_year": {"type": "integer", "index": "not_analyzed"},
                "options": {"type": "string", "index": "not_analyzed"},
                "ordered": {"type": "integer", "index": "not_analyzed"},
                "stores": {
                    "type": "nested", "index": "not_analyzed",
                    "properties": {
                        "store": {"type": "integer", "index": "not_analyzed"},
                        "price": {"type": "float", "index": "not_analyzed"},
                        "stock": {"type": "integer", "index": "not_analyzed"}
                    }
                }
            }
        }
    }
}

Here I have some options that I would like to facet on by the ordered count,
e.g. how many books of the fiction genre have been ordered, etc.

Some example docs:

{
    "name": "John Steinbeck",
    "books": [
        {
            "title": "The Grapes of Wrath", "published_year": 1939,
            "options": ["genre#fiction", "country#american", "pulitzer#yes"],
            "ordered": 4,
            "stores": [
                {"store": 76, "price": 15.5, "stock": 16},
                {"store": 18, "price": 14.3, "stock": 54}
            ]
        }
    ]
}

{
    "name": "Philip Roth",
    "books": [
        {
            "title": "American Pastoral", "published_year": 1997,
            "options": ["genre#fiction", "country#american"],
            "ordered": 2,
            "stores": [
                {"store": 26, "price": 23.4, "stock": 65},
                {"store": 73, "price": 20.3, "stock": 45}
            ]
        }
    ]
}

Okay, now let's try to search for books published after 1900:

curl -s 'http://localhost:9200/nestedtest/authors/_search?pretty=true' -d '{
    "filter": {
        "nested": {
            "_scope": "latest",
            "path": "books",
            "filter": {
                "bool": {
                    "must": [
                        {"range": {"books.published_year": {"gt": 1900}}}
                    ]
                }
            }
        }
    },
    "facets": {
        "options": {
            "terms_stats": {
                "key_field": "books.options",
                "value_field": "books.ordered"
            },
            "scope": "latest"
        }
    }
}' && echo

(Apparently you have to give the full path to the field in facets if the field
names are not unique, e.g. when there is another 'ordered' field at a deeper level.)
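
To make the expected facet output concrete, here is a small local sketch in Python (no Elasticsearch involved) of what a terms_stats facet keyed on options with ordered as the value would compute over the two sample docs. The helper name terms_stats and the trimmed AUTHORS list are made up for illustration, and the sketch assumes the facet runs over the nested book docs that match the scoped filter.

```python
from collections import defaultdict

# Sample data copied from the two example docs (only the fields the facet needs).
AUTHORS = [
    {"name": "John Steinbeck", "books": [
        {"title": "The Grapes of Wrath", "published_year": 1939,
         "options": ["genre#fiction", "country#american", "pulitzer#yes"],
         "ordered": 4}]},
    {"name": "Philip Roth", "books": [
        {"title": "American Pastoral", "published_year": 1997,
         "options": ["genre#fiction", "country#american"],
         "ordered": 2}]},
]


def terms_stats(authors, min_year):
    """Hypothetical helper: group matching books by option and aggregate 'ordered'."""
    stats = defaultdict(lambda: {"count": 0, "total": 0})
    for author in authors:
        for book in author["books"]:
            # Same condition as the range filter: published_year > min_year.
            if book["published_year"] > min_year:
                for option in book["options"]:
                    stats[option]["count"] += 1
                    stats[option]["total"] += book["ordered"]
    # Add the mean per term, as a terms_stats-style summary would.
    return {term: dict(s, mean=s["total"] / s["count"]) for term, s in stats.items()}


if __name__ == "__main__":
    for term, s in sorted(terms_stats(AUTHORS, 1900).items()):
        print(term, s)
```

With both books matching, genre#fiction gets a count of 2 and a total of 6 ordered copies, which is the "how many fiction books have been ordered" number the facet is meant to answer.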

Here's the full test script: https://gist.github.com/2958662

Now, how can I also filter the results by available stock and price, i.e. keep
only the books published after 1900 that fall within a given price range and
are available in stock? Let's say: all the books available in stock that are
cheaper than 20 bucks.
Where should I put the second nested filter (with respect to performance and
caching)?
I tried to write requests with a filtered query, but got totally lost in
exceptions; the response said that nested is not supported within filtered.
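
To pin down what the combined filter should select, here is a plain-Python sketch of the intended semantics, run against the two sample docs rather than against Elasticsearch. It assumes the goal is that a book matches when a single store entry is both in stock and below the price limit; the helper books_in_stock_under and the BOOK_DOCS list are hypothetical names used only for this illustration.

```python
# Sample data copied from the two example docs (fields used by the filter only).
BOOK_DOCS = [
    {"name": "John Steinbeck", "books": [
        {"title": "The Grapes of Wrath", "published_year": 1939,
         "stores": [
             {"store": 76, "price": 15.5, "stock": 16},
             {"store": 18, "price": 14.3, "stock": 54}]}]},
    {"name": "Philip Roth", "books": [
        {"title": "American Pastoral", "published_year": 1997,
         "stores": [
             {"store": 26, "price": 23.4, "stock": 65},
             {"store": 73, "price": 20.3, "stock": 45}]}]},
]


def books_in_stock_under(authors, max_price, min_year=1900):
    """Hypothetical helper: titles with a store entry that is in stock and cheaper than max_price."""
    titles = []
    for author in authors:
        for book in author["books"]:
            if book["published_year"] <= min_year:
                continue
            # Both conditions must hold for the SAME store entry; this is the
            # per-entry matching that a nested path is meant to provide.
            if any(s["stock"] > 0 and s["price"] < max_price for s in book["stores"]):
                titles.append(book["title"])
    return titles


if __name__ == "__main__":
    print(books_in_stock_under(BOOK_DOCS, 20))
```

For the sample data and a limit of 20, only "The Grapes of Wrath" qualifies, because both stores selling "American Pastoral" charge more than 20.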


(Rauan Maemirov) #2

I've solved it; please check the updated gist: https://gist.github.com/2958662

curl -s 'http://localhost:9200/nestedtest/authors/_search?pretty=true' -d '{
    "filter": {
        "nested": {
            "_scope": "latest",
            "path": "books",
            "query": {
                "filtered": {
                    "query": {
                        "bool": {
                            "must": [
                                {"range": {"books.published_year": {"gt": 1900}}}
                            ]
                        }
                    },
                    "filter": {
                        "nested": {
                            "_scope": "available_in_stock",
                            "path": "books.stores",
                            "query": {
                                "bool": {
                                    "must": [
                                        {"range": {"books.stores.price": {"lt": 20.0}}}
                                    ]
                                }
                            }
                        }
                    }
                }
            }
        }
    },
    "facets": {
        "options": {
            "terms_stats": {
                "key_field": "options",
                "value_field": "ordered"
            },
            "scope": "latest"
        }
    }
}' && echo
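
The request above only constrains the price. As a sketch of how the stock condition from the original question might be added, the Python snippet below builds the same body as a dict and appends a second range clause on books.stores.stock. The function name build_search_body is hypothetical, the stock clause is an assumption that has not been run against a server, and everything else mirrors the request above.

```python
import json


def build_search_body(min_year, max_price):
    """Hypothetical helper: the request body above plus an assumed stock > 0 clause."""
    # Inner nested filter on the stores level; both ranges sit in one bool/must
    # so they apply to the same store entry.
    stores_filter = {
        "nested": {
            "_scope": "available_in_stock",
            "path": "books.stores",
            "query": {"bool": {"must": [
                {"range": {"books.stores.price": {"lt": max_price}}},
                # Assumption: "in stock" means stock greater than zero.
                {"range": {"books.stores.stock": {"gt": 0}}},
            ]}},
        }
    }
    # Outer nested query on the books level, wrapping the inner filter.
    books_query = {
        "filtered": {
            "query": {"bool": {"must": [
                {"range": {"books.published_year": {"gt": min_year}}},
            ]}},
            "filter": stores_filter,
        }
    }
    return {
        "filter": {"nested": {"_scope": "latest", "path": "books", "query": books_query}},
        "facets": {"options": {
            "terms_stats": {"key_field": "options", "value_field": "ordered"},
            "scope": "latest",
        }},
    }


if __name__ == "__main__":
    # Print JSON that could be passed to curl with -d.
    print(json.dumps(build_search_body(1900, 20.0), indent=4))
```

Building the body in code keeps the two levels of nesting readable and makes it easy to vary the year and price limits when testing.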

I wonder if this is the correct way to filter nested docs. I'm a little
suspicious about whether it's OK to use a query instead of a filter, since
queries are claimed to be slower.

On Wednesday, June 20, 2012 at 13:56:49 UTC+6, Rauan Maemirov wrote:

