Nested Documents and Term Facets

Hi everybody,

I need some help with the terms_stats facet when dealing with nested documents.
Assume I have a document mapping like the following:

{
  "a" : { "type" : "integer" },
  "b" : {
    "type" : "nested",
    "include_in_root" : true,
    "properties" : {
      "c" : { "type" : "integer" }
    }
  }
}

I want to know if I can run the following facet:

{
  "query" : { "match_all" : {} },
  "facets" : {
    "facet1" : {
      "terms_stats" : {
        "key_field" : "a",
        "value_field" : "b.c"
      },
      "nested" : "b"
    }
  }
}

I am not getting any results from this facet. Perhaps that is because I
have marked the facet as "nested", so it is looking for "a" inside the
nested document "b" rather than in the root object.
Any idea how I can reference the root field "a" when doing a nested facet?

Thanks a lot in advance!
Vinay

--

I don't believe that is possible when you actually say nested (as
opposed to just an inner object).
Are you saving _source? I think you might be able to facet on the source.

-Paul
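
A toy model may help show why the facet in the question comes back empty.
The sketch below is plain Python, not Elasticsearch code, and the helper
names (index_document, terms_stats, _scope, _root_id) are made up for
illustration. It assumes that each nested object is indexed as a separate
hidden entry carrying only its own "b.*" fields, and it ignores
"include_in_root", which only affects the root entry.

```python
# Toy model only: plain Python, not the Elasticsearch API.
def index_document(doc_id, source, nested_path):
    """Return the flat list of index entries one source document produces."""
    entries = []
    for obj in source.get(nested_path, []):
        # Each nested object becomes its own hidden entry that holds only
        # its own fields, prefixed with the nested path.
        hidden = {f"{nested_path}.{key}": value for key, value in obj.items()}
        hidden["_scope"] = nested_path
        hidden["_root_id"] = doc_id
        entries.append(hidden)
    # The root entry keeps the top-level fields.
    root = {key: value for key, value in source.items() if key != nested_path}
    root["_scope"] = "root"
    root["_root_id"] = doc_id
    entries.append(root)
    return entries


def terms_stats(entries, scope, key_field, value_field):
    """Group value_field by key_field over the entries of one scope."""
    buckets = {}
    for entry in entries:
        if entry["_scope"] != scope:
            continue
        if key_field not in entry or value_field not in entry:
            continue  # an entry missing either field contributes nothing
        buckets.setdefault(entry[key_field], []).append(entry[value_field])
    return {key: {"count": len(vals), "total": sum(vals)}
            for key, vals in buckets.items()}


entries = index_document(1, {"a": 7, "b": [{"c": 10}, {"c": 20}]}, "b")

# Scope "b" plays the role of "nested": "b". The hidden entries have "b.c"
# but no "a", so no bucket is ever created.
nested_result = terms_stats(entries, "b", "a", "b.c")
print(nested_result)  # {}
```

In this model the key field simply does not exist in the scope the facet
runs in, which matches the empty result described in the question.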

--

Okay. Thanks for clearing that up. I am not storing _source, to save space.
I thought including it in the root would allow this type of facet query,
since it creates separate documents for nested objects.

Vinay

--

"include_in_root" makes it ALSO be indexed as an inner object in the root
object, but then you'd need to "store: true" it there in order to get
the value of the field for faceting (or maybe you could use the source,
but does that require a script in the facet?). Then you do your facet on
the root, but be careful: the document in the root will include all
the nested objects, not just the matching ones.
This drove me mad for a couple of weeks.

I ended up "inverting" one of my queries: (1) turn the nested objects into
child documents; (2) query the children to narrow them down, and put any
conditions on the parent inside the recently added has_parent query. There
is no equivalent of has_parent for nested objects.
I've learned that has_child queries are slower than nested queries, because
nested documents are stored with a special trick alongside their outer
documents, but my queries were fast enough.
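
The "inverted" query can be pictured with a small toy model. This is plain
Python, not the Elasticsearch API; the documents, the "parent" pointer and
the search_children helper are invented for illustration. It only shows
the idea of keeping children that match their own condition and whose
parent matches a second condition, which is roughly the role has_parent
plays.

```python
# Toy model only: plain Python, not the Elasticsearch API.
parents = {
    "p1": {"a": 7},
    "p2": {"a": 9},
}
# Each child points at its parent instead of being embedded in it.
children = [
    {"parent": "p1", "c": 10},
    {"parent": "p1", "c": 20},
    {"parent": "p2", "c": 20},
]


def search_children(children, parents, child_pred, parent_pred):
    """Keep children that match child_pred and whose parent matches parent_pred."""
    return [
        child for child in children
        if child_pred(child) and parent_pred(parents[child["parent"]])
    ]


hits = search_children(
    children,
    parents,
    child_pred=lambda child: child["c"] >= 20,    # narrows the children
    parent_pred=lambda parent: parent["a"] == 7,  # stands in for has_parent
)
print(hits)  # [{'parent': 'p1', 'c': 20}]
```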

Another method I used for another object was to repeat a field or two
from the parent in the child (or nested object) so that I didn't need to
join parents with children (or outer with nested).
Buy another terabyte drive, they cost $100 or less, if you are worried
about storage :)
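
As a rough sketch of that duplication, the code below copies the root
field "a" into every object under "b" before indexing, so that a facet
running purely over the nested objects can group by it. It is plain
Python rather than Elasticsearch code, and denormalize and
nested_terms_stats are hypothetical helpers, not library functions.

```python
# Toy model only: plain Python, not the Elasticsearch API.
def denormalize(source, nested_path, parent_field):
    """Return a copy of source with parent_field repeated in each nested object."""
    copied = dict(source)
    copied[nested_path] = [
        {**obj, parent_field: source[parent_field]}
        for obj in source[nested_path]
    ]
    return copied


def nested_terms_stats(sources, nested_path, key_field, value_field):
    """Group value_field by key_field, looking only at the nested objects."""
    buckets = {}
    for source in sources:
        for obj in source[nested_path]:
            buckets.setdefault(obj[key_field], []).append(obj[value_field])
    return {key: {"count": len(vals), "total": sum(vals)}
            for key, vals in buckets.items()}


docs = [
    {"a": 7, "b": [{"c": 10}, {"c": 20}]},
    {"a": 9, "b": [{"c": 5}]},
]
ready = [denormalize(doc, "b", "a") for doc in docs]

# Both the key and the value now live inside each nested object.
stats = nested_terms_stats(ready, "b", "a", "c")
print(stats)
# {7: {'count': 2, 'total': 30}, 9: {'count': 1, 'total': 5}}
```

The cost is one extra copy of "a" per nested object, which is the storage
trade-off discussed here.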

To me this didn't feel right, because I was used to aiming for highly
normalized data. But we are not building a DB, we are building an index
to data, and some of that data is in the index. Even in a DB we repeat
data all the time when we ask for various additional indices. In a
database we just don't see the "denormalized" values that result from
building compound and secondary indices out of various fields, but there
really is something out there representing that data again.

-Paul

--

Yes, with "include_in_root" and without "nested", it should work:
https://gist.github.com/4331528

Igor

--

Igor, P Hill, Thanks for the suggestions!

I ended up duplicating the field on which I want to calculate "terms_stats"
and that's working perfectly for now. It takes a bit more storage, but that
works since our dataset is not so massive at this point.
Thanks again for detailed answers!

Vinay

--

Hi Vinay,

That's what "include_in_root" is basically doing, so that you don't have to
duplicate the field yourself.

Igor

--

The thing is, my nested field is not a single object but an array of
objects, which is why I used the "nested" type: I want to find only the
array objects which match a particular filter.
I tried faceting with "include_in_root" and omitting the "nested"
property from the facet query, but then it returned results by collecting
stats from all objects of the array rather than just the one or two which
match the facet filter. Hope I was able to explain myself :)

Vinay
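
The difference described here can be reproduced in a toy model. This is
plain Python, not the Elasticsearch API, and the "tag" field, the sample
documents and both helper functions are assumptions made up for the
example. flattened_stats imitates faceting on the root with
"include_in_root": once a root document matches the filter, every "b.c"
value in it is counted. per_object_stats imitates a nested facet with a
filter, reading the key from the root as a stand-in for a duplicated key
field.

```python
# Toy model only: plain Python, not the Elasticsearch API.
docs = [
    {"a": 7, "b": [{"c": 10, "tag": "x"}, {"c": 20, "tag": "y"}]},
    {"a": 9, "b": [{"c": 5, "tag": "y"}]},
]


def summarize(buckets):
    return {key: {"count": len(vals), "total": sum(vals)}
            for key, vals in buckets.items()}


def flattened_stats(docs, pred):
    """Root-level view: a matching document contributes ALL of its objects."""
    buckets = {}
    for doc in docs:
        if any(pred(obj) for obj in doc["b"]):
            buckets.setdefault(doc["a"], []).extend(obj["c"] for obj in doc["b"])
    return summarize(buckets)


def per_object_stats(docs, pred):
    """Nested view: only the objects that match the filter contribute."""
    buckets = {}
    for doc in docs:
        for obj in doc["b"]:
            if pred(obj):
                buckets.setdefault(doc["a"], []).append(obj["c"])
    return summarize(buckets)


def only_y(obj):
    return obj["tag"] == "y"


flattened = flattened_stats(docs, only_y)
per_object = per_object_stats(docs, only_y)
print(flattened)   # {7: {'count': 2, 'total': 30}, 9: {'count': 1, 'total': 5}}
print(per_object)  # {7: {'count': 1, 'total': 20}, 9: {'count': 1, 'total': 5}}
```

For key 7 the flattened view also counts the object tagged "x", which is
the over-counting described in this message.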

--