Nesting more than one level of child parent

I need to index 3 levels (or more) of child-parent.
For example, the levels might be an author, a book, and characters from that book.

However, when indexing more than two levels there is a problem with has_child and has_parent queries and filters.
If I have 5 shards, I get about one fifth of the results when running a has_parent query on the lowest level (characters) or a has_child query on the second level (books).

My guess is that a book gets indexed to a shard by its parent id and so resides together with its parent (author), but a character gets indexed to a shard based on the hash of the book id, which does not necessarily match the shard the book was actually indexed on.

This means that all characters of books by the same author do not necessarily reside in the same shard (kind of crippling the whole child-parent advantage really).
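
To make that guess concrete, here is a small sketch of the suspected placement. It is only a simulation: `shard_for` and `placement` are made-up helpers, and `zlib.crc32` is a stand-in for whatever hash Elasticsearch really uses, so the actual shard numbers will differ. The point is just that a character routed by its book id has no reason to land on the author's shard.

```python
import zlib

NUM_SHARDS = 5  # matches the 5-shard setup described above


def shard_for(routing_value: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from a routing value.

    zlib.crc32 is only a deterministic stand-in; Elasticsearch uses its
    own hash function internally.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def placement(author_id: str, book_id: str) -> dict:
    """Shard of each level when routing defaults to the direct parent's id."""
    return {
        "author": shard_for(author_id),   # routed by its own id
        "book": shard_for(author_id),     # routed by its parent id (the author)
        "character": shard_for(book_id),  # routed by its parent id (the book)
    }


if __name__ == "__main__":
    for n in range(1, 6):
        p = placement("author-1", f"book-{n}")
        status = "same shard" if p["character"] == p["book"] else "SPLIT"
        print(f"book-{n}: {p} -> {status}")
```

The author and the book always agree because both hash the author id; the character only agrees by chance.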

Am I doing something wrong? How can I resolve this? I really need complex queries such as "what authors wrote books with female characters".
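
For reference, the kind of query meant here could be sketched as two has_child clauses, one inside the other. The type names `book` and `character` follow the example above; the `gender` field and the helper name `authors_with_character` are assumptions made up for this sketch.

```python
import json


def authors_with_character(field: str, value: str) -> dict:
    """Build a search body matching authors that have a book that has a
    character whose `field` equals `value` (field name is hypothetical)."""
    return {
        "query": {
            "has_child": {
                "type": "book",
                "query": {
                    "has_child": {
                        "type": "character",
                        "query": {"term": {field: value}},
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Search body to run against the author type.
    print(json.dumps(authors_with_character("gender", "female"), indent=2))
```

Such a query can only give complete results if every author, book, and character of one family sits on the same shard.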

I made a gist showing the problem: Elasticsearch Parent-Child-Grandchild problem · GitHub

If you're indexing multi-level parent-child documents you need to use the
routing query string option in addition to the parent query string
option. The routing value should always be the id of the
first hierarchy level (author in your case). This way all books from the
same author and the characters of these books always reside on the same shard.
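
As a rough sketch of what that looks like on the index requests, the helper below builds the request path for each level. The index name `library`, the ids, and the `index_url` helper are assumptions for illustration; only the `parent` and `routing` query string options come from the advice above.

```python
from typing import Optional
from urllib.parse import urlencode


def index_url(index: str, doc_type: str, doc_id: str,
              parent: Optional[str] = None,
              routing: Optional[str] = None) -> str:
    """Build the path and query string of an index request."""
    params = {}
    if parent is not None:
        params["parent"] = parent
    if routing is not None:
        params["routing"] = routing
    path = f"/{index}/{doc_type}/{doc_id}"
    return f"{path}?{urlencode(params)}" if params else path


# Level 1: the author is routed by its own id.
author_url = index_url("library", "author", "a1")
# Level 2: the parent is the author, so the default routing is already "a1".
book_url = index_url("library", "book", "b1", parent="a1")
# Level 3: the parent is the book, but routing is forced to the author id.
character_url = index_url("library", "character", "c1",
                          parent="b1", routing="a1")

if __name__ == "__main__":
    for url in (author_url, book_url, character_url):
        print("PUT", url)
```

The same routing value most likely has to be supplied again when getting, updating, or deleting a character by id, since the shard can no longer be derived from the parent id alone.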

On 3 April 2013 11:16, eranid eranid@gmail.com wrote:

I need to index 3 levels (or more) of child-parent.

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/Nesting-more-than-one-level-of-child-parent-tp4032822.html
Sent from the Elasticsearch Users mailing list archive at Nabble.com.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
Kind regards,

Martijn van Groningen

Great, Thanks a lot.

On Wed, Apr 3, 2013 at 1:01 PM, Martijn v Groningen [via Elasticsearch
Users] ml-node+s115913n4032827h13@n3.nabble.com wrote:

If you're indexing multi-level parent-child documents you need to use the
routing query string option in addition to the parent query string
option. The routing value should always be the id of the
first hierarchy level (author in your case). This way all books from the
same author and the characters of these books always reside on the same shard.

In the case of books and characters, I would suggest nesting the characters in
the book documents, but I'm guessing your real application isn't about books :wink:
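
A minimal sketch of that alternative, assuming the `_parent` mapping style in use at the time of this thread and a hypothetical `characters.gender` field: characters become nested objects inside each book, so only one parent-child level (author to book) remains and the default routing is enough.

```python
import json

# Book mapping: child of author, with characters stored as nested objects.
book_mapping = {
    "book": {
        "_parent": {"type": "author"},
        "properties": {
            "characters": {"type": "nested"},
        },
    }
}

# "Authors who wrote books with female characters": one has_child level
# wrapping a nested query on the characters inside the book.
authors_query = {
    "query": {
        "has_child": {
            "type": "book",
            "query": {
                "nested": {
                    "path": "characters",
                    "query": {"term": {"characters.gender": "female"}},
                }
            },
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(book_mapping, indent=2))
    print(json.dumps(authors_query, indent=2))
```

The trade-off is that changing a single character means reindexing the whole book document.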

@Martijn, maybe you could add the routing fix to the docs where the parent
parameter is explained?

On Apr 3, 2013 2:47 PM, "eranid" eranid@gmail.com wrote:

Great, Thanks a lot.

@Jaap Makes sense, I'll update the docs.

On 3 April 2013 20:41, Jaap Taal jaap@q42.nl wrote:

In the case of books and characters, I would suggest nesting the characters in
the book documents, but I'm guessing your real application isn't about books :wink:

@Martijn, maybe you could add the routing fix to the docs where the parent
parameter is explained?

--
Kind regards,

Martijn van Groningen
