Aggregating on nested fields

Is it possible to aggregate only on the nested documents that are returned
by a (filtered) query? From what I can tell, a nested aggregation operates
on all nested documents of every parent document that has at least one
nested document satisfying the nested query/filter. Did that make sense? :) Is
this the same limitation as issue #3022? I know that number by heart by now.

For example, I have 3 simple documents, where the nstd object is defined as
nested:

        {
           "name" : "foo",
           "nstd" : [
              {
                 "ID" : 1
              }
           ]
        }

        {
           "name" : "bar",
           "nstd" : [
              {
                 "ID" : 2
              }
           ]
        }

        {
           "name" : "baz",
           "nstd" : [
              {
                 "ID" : 1
              },
              {
                 "ID" : 2
              }
           ]
        }

I then execute a simple nested query:

        "query": {
           "filtered": {
              "query": {
                 "match_all": {}
              },
              "filter": {
                 "nested": {
                    "path": "nstd",
                    "filter": {
                       "term": {
                          "nstd.ID": 1
                       }
                    }
                 }
              }
           }
        }

If I aggregate on the nstd.ID field, I will always get back results for
nested documents that were excluded by the filter:

        "buckets": [
           {
              "key": 1,
              "doc_count": 2
           },
           {
              "key": 2,
              "doc_count": 1
           }
        ]

Since the nested document with ID 2 does not match the filter, it should not
be counted in the aggregation. I have tried using a filter aggregation with
the same filter used in the filtered query, but I receive the same results.
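
The aggregation request itself is not shown in this message. As a minimal sketch only, a request body along the following lines would be expected to produce buckets like those above. The aggregation names `nstd_docs` and `ids` are hypothetical, and the syntax assumes the same 1.x-era query DSL as the filtered query:

```json
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "nested": {
          "path": "nstd",
          "filter": { "term": { "nstd.ID": 1 } }
        }
      }
    }
  },
  "aggs": {
    "nstd_docs": {
      "nested": { "path": "nstd" },
      "aggs": {
        "ids": {
          "terms": { "field": "nstd.ID" }
        }
      }
    }
  }
}
```

The query keeps the root documents "foo" and "baz", and the nested aggregation then visits every nested document of those two roots, including the ID 2 entry of "baz".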

Cheers,

Ivan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQBsrKGntY-WT1PWZbTynxFvfw%2BYc7K2Q7a8NX3ive7t2w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reproducible gist: Nested aggregation issue · GitHub

Surely I cannot be the only one to have encountered this issue.

--
Ivan

In reply to: Ivan Brusic, Mon, Nov 10, 2014 at 12:53 PM

I suddenly remembered that with facets I had to apply the same query
filter as a facet filter with the join option disabled. It turns out that
aggregations behave in much the same way. My problem was that my nested
aggregation was not under the scope of the filter aggregation. I
hope #3022 and related issues can bring about less ambiguous aggregations.
Nested aggregations on pre-filtered nested documents should work as is. If
not, the global scope aggregation should be used.

--

Ivan

In reply to: Ivan Brusic, Mon, Nov 10, 2014 at 3:43 PM

Hi Ivan,

You do indeed need to repeat the filter under the nested aggregation to make
it work. If we ever allow queries to return nested documents, I agree that
filters should not have to be repeated under aggs. But since queries currently
only return root documents, I think it is actually consistent to return all
nested docs under a nested aggregation, and not only those that matched a
(potential) nested query. I also like the fact that it allows aggregations
to not know about the query.
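
As a rough sketch of what repeating the filter under the nested aggregation could look like: the aggregation names `nstd_docs`, `only_id_1` and `ids` are hypothetical, and the syntax assumes the 1.x-era filter aggregation that matches the filtered query used in this thread.

```json
{
  "aggs": {
    "nstd_docs": {
      "nested": { "path": "nstd" },
      "aggs": {
        "only_id_1": {
          "filter": { "term": { "nstd.ID": 1 } },
          "aggs": {
            "ids": {
              "terms": { "field": "nstd.ID" }
            }
          }
        }
      }
    }
  }
}
```

Because the filter aggregation sits inside the nested aggregation, it is evaluated against the nested documents themselves, so a plain term filter is enough there. With the three example documents and the original query, the terms aggregation would be expected to see only the two nested documents with ID 1.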

In reply to: Ivan Brusic, Tue, Nov 11, 2014 at 5:27 PM

--
Adrien Grand


I beg to differ: aggregations work with the root documents returned by the
query, so they do not work under a global context. :) I guess under my
proposed vision the issue would then be how to aggregate on the documents
matched by a nested filter while still keeping access to all the nested
documents. A pseudo-global nested context. That sounds more painful than
returning all the nested documents.

Cheers,

Ivan

In reply to: Adrien Grand, Tue, Nov 11, 2014 at 5:30 PM
