Querying a parent type that is also a child type for another type


(andreasch) #1

I have a case with 2 parent/child mappings where a contact is a parent of
an order and an order is a parent of a product. I am using a search query
and a has_child filter that matches everything.
Searching on contact returns all contacts as expected. However, when I run
the same search query on the orders it only returns 1 of the orders.

Is there an issue when using has_child on a parent type that is also a
child of another type?
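
For readers without the gist at hand, a "has_child filter that matches everything" could look roughly like the sketch below. This is an assumption about the shape of the request, written in the filtered-query syntax of Elasticsearch releases from that period; the helper name `has_child_match_all` is hypothetical, and the exact queries are the ones in the gist.

```python
import json


def has_child_match_all(child_type):
    """Build a search body that returns every parent having at least one
    child of `child_type` (assumed 2012-era filtered-query syntax)."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "has_child": {
                        "type": child_type,
                        "query": {"match_all": {}},
                    }
                },
            }
        }
    }


# Searching contacts filters on their "order" children;
# searching orders filters on their "product" children.
contacts_body = has_child_match_all("order")
orders_body = has_child_match_all("product")
print(json.dumps(orders_body, indent=2))
```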

I am providing a gist that shows the mappings, the indexed data and the 2
querying examples:

https://gist.github.com/3783212

Thanks,
Andreas

--


(andreasch) #2

I forgot to mention that if I remove the parent mapping from order to
contact then the query returns the correct results.

--


(Igor Motov) #3

The has_child request can only work if parents and children are indexed in
the same shard. By default, the shard for a child document is determined by
its parent id. In your case this works fine at the contact-order level:
orders are routed correctly based on contact ids. But when you index
products, the parent field contains the order id, so the products end up in
the wrong shards. You can switch to a single-shard index or specify the
contact id in the routing parameter when you index
products: https://gist.github.com/3784684
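
The routing problem described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `zlib.crc32` stands in for whatever hash Elasticsearch uses internally, the shard count of 5 and the document ids are made up, and `shard_for` and `placement` are hypothetical helper names.

```python
import zlib

NUM_SHARDS = 5  # assumption: a multi-shard index


def shard_for(routing_value, num_shards=NUM_SHARDS):
    """Pick a shard from a routing value.

    crc32 is a stand-in hash for illustration only; it is not the
    function Elasticsearch itself uses.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def placement(contact_id, order_id, product_id, product_routing=None):
    """Return the shard each document of one hierarchy lands on.

    contact: routed by its own id
    order:   routed by its parent id (the contact id)
    product: routed by its parent id (the order id), unless an explicit
             routing value is supplied
    """
    return {
        "contact": shard_for(contact_id),
        "order": shard_for(contact_id),
        "product": shard_for(product_routing or order_id),
    }


# Default behaviour: the product follows the order id, which says nothing
# about where the order itself was placed.
default = placement("contact-1", "order-1", "product-1")

# With the contact id as explicit routing, all three levels share a shard.
fixed = placement("contact-1", "order-1", "product-1",
                  product_routing="contact-1")

print(default)
print(fixed)
```

With the default routing the product's shard depends only on the order id, so it matches the order's shard only by coincidence. With the explicit routing value the three documents are guaranteed to be co-located.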

--


(andreasch) #4

Igor,

Thanks for the explanation. I was mistakenly under the impression that the
appropriate routing of types with respect to parent/child relationships was
done automatically.
It seems a bit cumbersome to propagate the ID of the top parent all the way
down to each indexing operation. This will be even harder to maintain when
you have 4-5 levels in the parent/child hierarchy.

I wonder if it would be possible for ElasticSearch to determine the shard
of a parent document at the time a child is being indexed.

--


(Igor Motov) #5

I think it should be possible to implement something like this in
elasticsearch, but I am not sure it would be worth it considering how much
complexity and how many limitations it would bring. It would require each
insert operation to look up the parent by broadcasting to all shards, and
it would require the parent to exist when children are indexed. I think in
most cases it would be more efficient to just propagate the top-level ID,
or to come up with some other routing value that is consistently applied to
the top parent and all its descendants.
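
One way to keep the propagation manageable is to build every index request through a single helper that always attaches the top-level id as the routing value. The sketch below only constructs request paths and sends nothing. It assumes the `parent` and `routing` query parameters of the index API from that period; the index name `shop`, the ids, and the helper `index_url` are hypothetical.

```python
from urllib.parse import urlencode


def index_url(index, doc_type, doc_id, parent_id=None, root_id=None):
    """Build the path for an index request.

    parent_id: id of the direct parent, sent as the `parent` parameter
    root_id:   id of the top-level ancestor, sent as the `routing`
               parameter so the whole hierarchy shares one shard
    """
    params = {}
    if parent_id is not None:
        params["parent"] = parent_id
    if root_id is not None:
        params["routing"] = root_id
    query = "?" + urlencode(params) if params else ""
    return f"/{index}/{doc_type}/{doc_id}{query}"


# The contact is the root, so its own id doubles as the routing value.
print(index_url("shop", "contact", "c1"))
# /shop/contact/c1

# The order's parent is the root, so parent and routing coincide.
print(index_url("shop", "order", "o1", parent_id="c1", root_id="c1"))
# /shop/order/o1?parent=c1&routing=c1

# The product's parent is the order, but routing still uses the contact id.
print(index_url("shop", "product", "p1", parent_id="o1", root_id="c1"))
# /shop/product/p1?parent=o1&routing=c1
```

Deeper hierarchies work the same way: each level passes its own direct parent, while `root_id` stays fixed for every descendant.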

--

