Getting unique list from array of nested objects

Hi guys,

I am using CouchDB with Elasticsearch and, up until recently, I've been a
very happy and satisfied user. Recently I hit an issue that I could not
figure out how to solve using pure Elasticsearch. I wanted to ask whether
such a thing is even possible. So here it goes:

I have two types of documents:

  1. product
  2. productionOrder

A productionOrder may contain one or many products. So if I want to
retrieve the list of products that are contained within productionOrders
(that is, products that have productionOrders being made), I'll get a list
that will most probably contain duplicate products.

First, I need to devise a query to obtain the products found in
productionOrders. I read through the docs and it seems like I have to map
my products within productionOrders using the type "nested". After this I
can do a query on products using the has_parent filter with the parent_type
set to "productionOrder". Is this correct?

Next, I need to remove the duplicate products from the result set. I have
no idea how to do this purely in Elasticsearch. I need a pure Elasticsearch
query/filter option because I use it for paging/sorting/filtering my
kendoUI grids. I'm using pyes. How would I remove the duplicates, if the
only thing that differentiates one product from another is its id
attribute?

--

Hi Mark,

In order to use the has_parent filter you need to use the _parent
field in your mapping instead of the nested type.
When using the parent/child capabilities of Elasticsearch you need to
decide which type is the parent and which is the child. From what I
understand from your email, it seems that productionOrder is the parent.
Take a look at the _parent field:
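
A minimal sketch of what such a mapping body could look like, written as a plain Python dict so it can be passed to whichever client call you use to put a mapping. The type names come from this thread; the helper name parent_mapping is made up for illustration, and the exact request format may differ between Elasticsearch versions.

```python
import json


def parent_mapping(child_type, parent_type):
    """Build a mapping body declaring parent_type as the parent of child_type."""
    return {child_type: {"_parent": {"type": parent_type}}}


# Assumption taken from the reply above: productionOrder is the parent,
# product is the child.
product_mapping = parent_mapping("product", "productionOrder")
print(json.dumps(product_mapping, indent=2))
```

With a mapping like this, each product document also has to be indexed with a parent value naming the productionOrder it belongs to.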

Once you have configured the _parent field properly you can use the
top_children, has_child and has_parent queries and filters.
I just uploaded my slides about document relations (they also cover
nested objects and parent/child in ES):
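
As a rough illustration of the has_parent filter mentioned above, the sketch below builds a search body that returns only products whose parent is a productionOrder. It assumes the _parent mapping is already in place; the helper name has_parent_search and the match_all inner query are illustrative choices, and the filtered query syntax shown is the one used by Elasticsearch releases of this era.

```python
import json


def has_parent_search(parent_type, parent_query=None):
    """Build a search body matching child docs whose parent matches parent_query."""
    if parent_query is None:
        # Accept any parent document of the given type.
        parent_query = {"match_all": {}}
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "has_parent": {
                        "parent_type": parent_type,
                        "query": parent_query,
                    }
                },
            }
        }
    }


# Run this body against the product type to get products that have a
# productionOrder parent.
search_body = has_parent_search("productionOrder")
print(json.dumps(search_body, indent=2))
```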

Elasticsearch doesn't support removing duplicates from the result set
(also known as result grouping) yet. You can achieve pseudo result
grouping by faceting on the productOrderId field (the field in the
product mapping referring to the productOrder it belongs to) in
combination with subsequent search requests.
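
The two-step approach described above could be sketched roughly as follows: one request with a terms facet to list the distinct productOrderId values, then one small search per value. The helper names, the facet name "groups" and the sample response are invented for illustration; only the productOrderId field name comes from this thread.

```python
def facet_request(field, max_groups=10):
    """Step 1: ask only for a terms facet on `field` (no hits needed)."""
    return {
        "query": {"match_all": {}},
        "size": 0,
        "facets": {"groups": {"terms": {"field": field, "size": max_groups}}},
    }


def group_requests(facet_response, field, hits_per_group=1):
    """Step 2: build one follow-up search body per facet term."""
    entries = facet_response["facets"]["groups"]["terms"]
    return [
        {
            "query": {
                "filtered": {
                    "query": {"match_all": {}},
                    "filter": {"term": {field: entry["term"]}},
                }
            },
            "size": hits_per_group,
        }
        for entry in entries
    ]


# Hand-written stand-in for a terms facet response, not real output.
sample_response = {
    "facets": {
        "groups": {
            "_type": "terms",
            "terms": [{"term": "po-1", "count": 3}, {"term": "po-2", "count": 1}],
        }
    }
}

first_body = facet_request("productOrderId")
follow_ups = group_requests(sample_response, "productOrderId")
```

Each body in follow_ups returns at most hits_per_group documents for one group, so paging has to be done over the facet terms rather than over the raw hits.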

Martijn

On 8 November 2012 08:50, Mark Huang zhenghao12@gmail.com wrote:

[original message quoted above]

--
Kind regards,

Martijn van Groningen
