Counting unique root objects on nested aggregations


(Kallin Nagelberg) #1

I'm trying to build a query to aggregate on some fields in a nested
document, but instead of returning the count of the nested documents for
each aggregation, I'd like to know the number of root objects.

For example, I have a mapping like this (from the docs):

"product" : {
  "properties" : {
    "resellers" : {
      "type" : "nested",
      "properties" : {
        "name" : { "type" : "string" },
        "price" : { "type" : "double" }
      }
    }
  }
}

Now let's say I want to know how many products have each reseller name.
That's not straightforward as far as I can tell.

If I do an agg like:

"aggs": {
  "resellers": {
    "nested": {
      "path": "resellers"
    },
    "aggs": {
      "names": {
        "terms": {
          "field": "resellers.name"
          ...

I'll get back something like this (assuming that each product has many
resellers):

hits: 20,
aggregations: {
  resellers: {
    doc_count: 100,
    names: {
      buckets: [
        {
          key: 'name1',
          doc_count: 50
        },
        {
          key: 'name2',
          doc_count: 50
        }
      ]
    }
  }
}

So it's aggregating on the nested objects, i.e. there are 100 reseller nested
docs, and of those 50 have name1 and 50 have name2.

What I'm interested in, though, is how many products have a reseller with
name1 and how many have a reseller with name2.

That is, it should say something like:

  • 15 products have a reseller w/ name name1
  • 10 products have a reseller w/ name name2
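To make the difference between the two counts concrete, here is a small pure-Python sketch over made-up data. The product IDs and reseller lists are hypothetical and no Elasticsearch is involved; it only illustrates counting a name once per nested document versus once per product.

```python
from collections import Counter

# Hypothetical toy data: each product holds a list of nested reseller objects.
products = [
    {"id": 1, "resellers": [{"name": "name1"}, {"name": "name1"}, {"name": "name2"}]},
    {"id": 2, "resellers": [{"name": "name1"}]},
    {"id": 3, "resellers": [{"name": "name2"}, {"name": "name2"}]},
]


def count_nested_docs(products):
    """Count every nested reseller object, as a nested + terms agg would."""
    return Counter(r["name"] for p in products for r in p["resellers"])


def count_root_docs(products):
    """Count each product at most once per reseller name."""
    counts = Counter()
    for p in products:
        # A set collapses repeated names within the same product.
        counts.update({r["name"] for r in p["resellers"]})
    return counts


nested_counts = count_nested_docs(products)
root_counts = count_root_docs(products)
print(dict(nested_counts))
print(dict(root_counts))
```

The per-product count can never exceed the per-nested-document count, and the two only match when no product lists the same reseller name twice.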

It looks like I can do this by putting a reverse_nested aggregation below
my names agg, then doing a value_count aggregation on the ID of the root
object. This seems kind of roundabout and I wonder if I'm missing an
easier way. Any suggestions would be appreciated!
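A minimal sketch of that request body, built as a Python dict so it can be printed as JSON. The aggregation names "products" and "product_ids" and the root-level "id" field are assumptions for illustration; the mapping above does not define an ID field, so substitute whatever identifies a product in your index.

```python
import json


def build_root_count_aggs(path="resellers", field="resellers.name", id_field="id"):
    """Build nested -> terms -> reverse_nested -> value_count aggregations.

    id_field is a hypothetical root-level field; it is not in the thread's mapping.
    """
    return {
        "aggs": {
            "resellers": {
                "nested": {"path": path},
                "aggs": {
                    "names": {
                        "terms": {"field": field},
                        "aggs": {
                            # reverse_nested with no path joins back to the root document.
                            "products": {
                                "reverse_nested": {},
                                "aggs": {
                                    "product_ids": {
                                        "value_count": {"field": id_field}
                                    }
                                },
                            }
                        },
                    }
                },
            }
        }
    }


body = build_root_count_aggs()
print(json.dumps(body, indent=2))
```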

Thanks,
-Kal

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1e895c15-accc-4638-b19c-ec1d263ca53b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Kallin Nagelberg) #2

I realized this could be simplified by leaving out the 'value_count'
aggregation within the reverse_nested, as that information is already
provided by the included 'doc_count'. I guess it can't be simplified much
beyond this.
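As a rough sketch of the simplified form, the body below keeps only the reverse_nested step and reads its doc_count from each bucket. The aggregation name "products" is an assumption, and the sample response is hand-written to mirror the numbers used earlier in this thread, not output from a real cluster.

```python
def build_simplified_aggs(path="resellers", field="resellers.name"):
    """Build nested -> terms -> reverse_nested, with no value_count."""
    return {
        "aggs": {
            "resellers": {
                "nested": {"path": path},
                "aggs": {
                    "names": {
                        "terms": {"field": field},
                        # "products" is a hypothetical name for the reverse_nested agg.
                        "aggs": {"products": {"reverse_nested": {}}},
                    }
                },
            }
        }
    }


def products_per_name(response):
    """Map each reseller name to the doc_count of its reverse_nested sub-agg."""
    buckets = response["aggregations"]["resellers"]["names"]["buckets"]
    return {b["key"]: b["products"]["doc_count"] for b in buckets}


# Hand-written sample shaped like the response discussed in the thread.
sample_response = {
    "aggregations": {
        "resellers": {
            "doc_count": 100,
            "names": {
                "buckets": [
                    {"key": "name1", "doc_count": 50, "products": {"doc_count": 15}},
                    {"key": "name2", "doc_count": 50, "products": {"doc_count": 10}},
                ]
            },
        }
    }
}

simplified_body = build_simplified_aggs()
root_counts_by_name = products_per_name(sample_response)
print(root_counts_by_name)
```

In each bucket, the outer doc_count is still the number of nested reseller documents; the doc_count under "products" is the number of root documents.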

Would it be worth including this information by default when doing a nested
agg (a doc_count on the reverse_nested)? It seems pretty useful, but I'm not
sure about the performance implications of always doing it. An option on
nested aggs to return the doc_count of the parent would be a nice-to-have for
sure!

On Thu, Jul 17, 2014 at 11:46 AM, Kallin Nagelberg <
kallin.nagelberg@gmail.com> wrote:



