Nested object retrieval (not querying, not filtering, just yet)

I am trying to figure out how to retrieve data from nested objects as if they
were a separate index/type.
According to the documentation, ES indexes nested docs separately, but how do
I retrieve them?

My data comes in as a deeply nested, rich document. I have simplified
it to the bare minimum to get to the point:

Sample data:
{
  "AssetId": "4b7b5c27-7cdd-40f0-bc01-24186bc108cd",
  "AssetTitle": "Inside Llewyn Davis",
  "CastMembers": [
    {
      "AssetCastCrewMemberId": "6765678",
      "CastCrewName": "Oscar Isaac",
      "CastCrewRole": "Actor"
    },
    {
      "AssetCastCrewMemberId": "15514452",
      "CastCrewName": "Carey Mulligan",
      "CastCrewRole": "Actor"
    }
  ]
}

Desired output:
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 2.2038321,
    "hits": [
      {
        "_index": "metadata",
        "_type": "asset",
        "_id": "6765678",
        "_score": 2.2038321,
        "_source": {
          "AssetCastCrewMemberId": "6765678",
          "CastCrewName": "Oscar Isaac",
          "CastCrewRole": "Actor-1"
        }
      },
      {
        "_index": "metadata",
        "_type": "asset",
        "_id": "15514452",
        "_score": 2.2038321,
        "_source": {
          "AssetCastCrewMemberId": "15514452",
          "CastCrewName": "Carey Mulligan",
          "CastCrewRole": "Actor-2"
        }
      }
    ]
  }
}

When I tried field selection:
{
  "fields": [
    "CastMembers.AssetCastCrewMemberId",
    "CastMembers.CastCrewName",
    "CastMembers.CastCrewRole"
  ]
}

I got:
{
  "took": 9,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "metadata",
        "_type": "asset",
        "_id": "4b7b5c27-7cdd-40f0-bc01-24186bc108cd",
        "_score": 1,
        "fields": {
          "CastMembers.CastCrewName": [
            "Oscar Isaac",
            "Carey Mulligan"
          ],
          "CastMembers.CastCrewRole": [
            "Actor-1",
            "Actor-2"
          ],
          "CastMembers.AssetCastCrewMemberId": [
            "6765678",
            "15514452"
          ]
        }
      }
    ]
  }
}

This is not quite what I need, because the association between Oscar Isaac
and his role Actor-1 is not reliable.
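
To make the problem concrete, here is a minimal Python sketch of why the pairing gets lost. It assumes a simplified model in which the sub-fields of each inner object are collapsed into multi-valued fields on the parent document. The helper names flatten_inner_objects and matches_all are hypothetical and not part of Elasticsearch, and the roles use the Actor-1/Actor-2 values from the output above.

```python
# Simplified model (an assumption, not Elasticsearch's actual code) of how
# inner objects end up as multi-valued fields on the parent document.
asset = {
    "AssetId": "4b7b5c27-7cdd-40f0-bc01-24186bc108cd",
    "CastMembers": [
        {"AssetCastCrewMemberId": "6765678",
         "CastCrewName": "Oscar Isaac",
         "CastCrewRole": "Actor-1"},
        {"AssetCastCrewMemberId": "15514452",
         "CastCrewName": "Carey Mulligan",
         "CastCrewRole": "Actor-2"},
    ],
}


def flatten_inner_objects(doc, array_field):
    """Collapse a list of inner objects into dotted multi-valued fields."""
    flat = {}
    for member in doc[array_field]:
        for key, value in member.items():
            flat.setdefault(f"{array_field}.{key}", []).append(value)
    return flat


def matches_all(flat, conditions):
    """True when every (field, value) pair occurs somewhere in the flat fields."""
    return all(value in flat.get(field, [])
               for field, value in conditions.items())


flat = flatten_inner_objects(asset, "CastMembers")

# Oscar Isaac is not "Actor-2", yet the flattened document still matches,
# because name and role are checked against independent value lists.
cross_match = matches_all(flat, {
    "CastMembers.CastCrewName": "Oscar Isaac",
    "CastMembers.CastCrewRole": "Actor-2",
})
print(cross_match)
```

In this model nothing ties a name to a role once the objects are flattened, which is the unreliability described above.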

Advice or even ideas are greatly appreciated.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

First of all, it appears that you are not using nested objects, but inner
objects. This article will hopefully help you understand the difference:

True nested documents (or perhaps parent/child) appear to be what you are
looking for. Keep in mind that with nested documents, your queries will return
the correct top-level document, but all the nested children will be
returned as well. See GitHub issue #3022 in elastic/elasticsearch, "Return
matching nested inner objects per hit".
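
As a rough illustration of the nested approach, the Python sketch below builds a mapping that declares CastMembers as nested, plus a nested query that requires name and role to match inside the same cast member. The "string" field types and the bool/match query are assumptions chosen to fit the sample data, not something taken from your setup. As noted, the hit is still the whole asset document.

```python
import json

# Assumed mapping for the "asset" type: CastMembers is declared as nested so
# each cast member is indexed as its own hidden document.
nested_mapping = {
    "asset": {
        "properties": {
            "AssetTitle": {"type": "string"},
            "CastMembers": {
                "type": "nested",
                "properties": {
                    "AssetCastCrewMemberId": {"type": "string"},
                    "CastCrewName": {"type": "string"},
                    "CastCrewRole": {"type": "string"},
                },
            },
        }
    }
}

# Nested query: both conditions must hold within a single cast member.
nested_query = {
    "query": {
        "nested": {
            "path": "CastMembers",
            "query": {
                "bool": {
                    "must": [
                        {"match": {"CastMembers.CastCrewName": "Oscar Isaac"}},
                        {"match": {"CastMembers.CastCrewRole": "Actor-1"}},
                    ]
                }
            },
        }
    }
}

# Print the request bodies as JSON, ready to send with any HTTP client.
print(json.dumps(nested_mapping, indent=2))
print(json.dumps(nested_query, indent=2))
```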

Cheers,

Ivan

On Tue, Nov 12, 2013 at 12:37 PM, Vladimir Khazin <vladimir.khazin@icssolutions.ca> wrote:

I am trying to figure out how to retrieve data from nested objects as if
they were a separate index/type.
According to the documentation, ES indexes nested docs separately, but how do
I retrieve them?

Thank you, Ivan, for the comment!

After reading about the difference between inner, nested, and parent/child
mappings, it seems to me that there is no practical way to achieve the
desired query output, other than creating a parent/child mapping between
asset and actor and indexing actors separately from assets.

Is there any better alternative to the direction I am about to take:

  1. Index asset as one type
  2. Index actor as another type, completely disconnected from the asset,
    and handle the sync by means outside of Elasticsearch

Is there any equivalent to MapReduce, triggers, or Couchbase views
in Elasticsearch that can reshape the data, not just
map/remap it?
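
If the sync is handled outside Elasticsearch, the reshaping step could look roughly like this minimal Python sketch. It splits one asset into one document per cast member, keyed by AssetCastCrewMemberId, and copies AssetId onto each so the link back to the asset is kept. The function name cast_member_docs and the decision to carry AssetId are assumptions for illustration only. Indexing the resulting documents is left to whatever client is in use.

```python
def cast_member_docs(asset):
    """Yield (doc_id, source) pairs, one per cast member of an asset."""
    for member in asset.get("CastMembers", []):
        source = dict(member)  # copy so the original asset is not modified
        # Assumed back-reference so each actor document points at its asset.
        source["AssetId"] = asset["AssetId"]
        yield member["AssetCastCrewMemberId"], source


asset = {
    "AssetId": "4b7b5c27-7cdd-40f0-bc01-24186bc108cd",
    "AssetTitle": "Inside Llewyn Davis",
    "CastMembers": [
        {"AssetCastCrewMemberId": "6765678",
         "CastCrewName": "Oscar Isaac",
         "CastCrewRole": "Actor-1"},
        {"AssetCastCrewMemberId": "15514452",
         "CastCrewName": "Carey Mulligan",
         "CastCrewRole": "Actor-2"},
    ],
}

docs = list(cast_member_docs(asset))
for doc_id, source in docs:
    print(doc_id, source["CastCrewName"], source["CastCrewRole"])
```

Each pair keeps name and role together in one document, which is the shape of the desired output, at the cost of having to re-run the split whenever an asset changes.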
