How to count filtered aggregations on nested types at parent level?


(Dan Testa) #1

I am trying to use aggregations on a nested type. I want to apply a nested
filter so that the aggs run on a subset of docs, but when I do that the "terms"
aggregation counts nested docs, not parent docs. In other words, if the same
term is found in two different nested documents belonging to the same parent,
the count will be two. I want it to count only once per parent document. Using
"include_in_parent" does not help because then I cannot apply the nested
filter.
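
To make the counting difference concrete, here is a small plain-Python sketch (no Elasticsearch involved). The documents, the "roles" field, and the "admin" flag are made up for illustration and are not taken from the gist; the two functions only mimic the two ways of counting.

```python
from collections import Counter

# Hypothetical parent documents, each holding a list of nested "roles" objects.
# Field names and values are illustrative assumptions, not the gist's mapping.
parents = [
    {"id": 1, "roles": [
        {"name": "Role1", "admin": True},
        {"name": "Role1", "admin": True},   # same term in a second nested doc
        {"name": "Role2", "admin": False},
    ]},
    {"id": 2, "roles": [
        {"name": "Role2", "admin": False},
    ]},
]


def count_nested(docs):
    """Count every matching nested doc (what a nested terms agg reports)."""
    counts = Counter()
    for doc in docs:
        for role in doc["roles"]:
            if role["admin"]:
                counts[role["name"]] += 1
    return counts


def count_parents(docs):
    """Count each term at most once per parent doc (the desired result)."""
    counts = Counter()
    for doc in docs:
        matching = {role["name"] for role in doc["roles"] if role["admin"]}
        counts.update(matching)
    return counts


nested_counts = count_nested(parents)
parent_counts = count_parents(parents)
print(dict(nested_counts))  # {'Role1': 2}
print(dict(parent_counts))  # {'Role1': 1}
```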

Here is my gist showing three different approaches:

https://gist.github.com/dptesta/10688636

  1. nested_aggs.sh: Does not use "include_in_parent". Counts "Role1" twice
    instead of once.

  2. nested_include.sh: Adds "include_in_parent". Counts "Role1" once,
    but also returns "Role2" since I cannot filter the nested docs.

  3. nested_include_key.sh: My workaround for now. Uses
    "include_in_parent" and also adds my filter criteria to a new field called
    "roleAdminKey". On the aggregation I then use the "include" parameter to
    apply my filter.
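
The idea behind approach 3 can be imitated in plain Python. This is only a sketch under assumptions: the "<admin flag>_<role name>" format for "roleAdminKey" values and the sample documents are hypothetical, and the helper `terms_with_include` is an illustration, not an Elasticsearch API.

```python
import re
from collections import Counter

# Hypothetical flattened view of parent docs after "include_in_parent":
# each parent carries the "roleAdminKey" values of its nested docs.
# The "<admin flag>_<role name>" key format is an assumption for illustration.
parents = [
    {"id": 1, "roleAdminKey": ["true_Role1", "true_Role1", "false_Role2"]},
    {"id": 2, "roleAdminKey": ["false_Role2"]},
]


def terms_with_include(docs, field, include):
    """Mimic a parent-level terms agg restricted by an "include" regex."""
    pattern = re.compile(include)
    counts = Counter()
    for doc in docs:
        # A parent contributes at most one count per distinct term.
        for term in set(doc[field]):
            if pattern.fullmatch(term):
                counts[term] += 1
    return counts


key_counts = terms_with_include(parents, "roleAdminKey", r"true_.*")
print(dict(key_counts))  # {'true_Role1': 1}
```

The cost of this approach is visible in the data: every filter that should be combined with a term needs its own pre-built key field.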

I have posted my results in the gist as well. While #3 above works, my
actual mapping contains many more fields with multiple levels of nesting
and I'd like to be able to apply several other filtered aggregations on
nested types without having to add a "*Key" field for each one.

Is there a way to filter "terms" aggregations on nested types while
returning the count of parent docs, not nested docs?

Thanks,

Dan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6f3cfbf8-13aa-4c21-87e2-4eb54abc3995%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Adrien Grand) #2

It is not possible to count parent documents yet, but this will hopefully
be available in Elasticsearch 1.2.0 via the reverse_nested
aggregation[1], which would be able to translate nested doc IDs back to
parent doc IDs.
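
As a rough sketch of how such a request might be shaped, the Python snippet below builds a search body with a reverse_nested sub-aggregation under a filtered nested terms aggregation. The nested path "roles", the fields "roles.admin" and "roles.name", and the aggregation names are assumptions for illustration; they are not taken from the gist.

```python
import json

# Hypothetical names: the nested path "roles" and the fields "roles.admin"
# and "roles.name" are assumptions, not taken from the gist's mapping.
request_body = {
    "size": 0,
    "aggs": {
        "roles": {
            # Step into the nested docs.
            "nested": {"path": "roles"},
            "aggs": {
                "admin_roles": {
                    # Restrict the aggregation to a subset of nested docs.
                    "filter": {"term": {"roles.admin": True}},
                    "aggs": {
                        "role_names": {
                            "terms": {"field": "roles.name"},
                            "aggs": {
                                # Join back to the parent docs per bucket.
                                "parent_docs": {"reverse_nested": {}}
                            },
                        }
                    },
                }
            },
        }
    },
}

print(json.dumps(request_body, indent=2))
```

With a body like this, each terms bucket would still report the nested doc count in its own "doc_count", while the "doc_count" of the "parent_docs" sub-aggregation would give the number of parent documents.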

[1] https://github.com/elasticsearch/elasticsearch/issues/5485

On Tue, Apr 15, 2014 at 7:14 AM, Dan Testa danptesta@gmail.com wrote:

Here is my gist showing three different approaches:

https://gist.github.com/dptesta/10688636


--
Adrien Grand



(Dan Testa) #3

This sounds exactly like what I need! I guess I can live with my
workaround until reverse_nested becomes available.

Thanks,
Dan


