Exclude specific bucket with integer key from term aggregation

micpalmia · August 7, 2014, 3:54pm

Hi all!

My documents contain an integer array field storing the ids of the tags
describing them. Given a specific tag id, I want to extract a list of the top
tags that occur most frequently together with the provided one.

I can solve this problem by combining a terms aggregation over the tag id
field with a term filter over the same field, but the list I get back
always starts with the tag id I provide: all documents matching
my filter have that tag, so it is the first in the list. I thought of using
the exclude field
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_filtering_values
to avoid creating the problematic bucket, but as I'm dealing with an
integer field, that seems not to be possible. This query

{
  "size": 0,
  "query": {
    "term": {
      "tag_ids": "00001"
    }
  },
  "aggs": {
    "tags": {
      "terms": {
        "size": 3,
        "field": "tag_ids",
        "exclude": "00001"
      }
    }
  }
}

returns an error saying that

Aggregation [tags] cannot support the include/exclude settings as it can only be applied to string values.

Is it possible to avoid getting back this bucket in some way?
Unfortunately, I can only use ES 1.2 (AWS plugin not yet ready for 1.3).
I'm wary of dealing with this problem after query execution, because
the bucket corresponding to the query is not guaranteed to be the first one
in the list, for example when there are only a few matching
documents, all having exactly the same two tags.
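
For reference, a minimal Python sketch of the post-query approach, under stated assumptions: ask for one bucket more than needed (`size + 1`) and drop the bucket for the queried tag by key rather than by position, so ties in `doc_count` do not matter. The helper names `build_request` and `top_cooccurring_tags` and the sample response are hypothetical; the response is assumed to have the usual `aggregations.<name>.buckets` shape with `key` and `doc_count` entries.

```python
def build_request(tag_id, size):
    """Build the search body, asking for one extra bucket to absorb the queried tag."""
    return {
        "size": 0,
        "query": {"term": {"tag_ids": tag_id}},
        "aggs": {"tags": {"terms": {"field": "tag_ids", "size": size + 1}}},
    }


def top_cooccurring_tags(response, tag_id, size):
    """Drop the bucket for tag_id by key (not by position), keep at most `size` buckets."""
    buckets = response["aggregations"]["tags"]["buckets"]
    kept = [b for b in buckets if int(b["key"]) != int(tag_id)]
    return kept[:size]


# Hypothetical response: the queried tag (1) ties with tag 7,
# so it is not necessarily the first bucket in the list.
sample_response = {
    "aggregations": {"tags": {"buckets": [
        {"key": 7, "doc_count": 2},
        {"key": 1, "doc_count": 2},
    ]}}
}

result = top_cooccurring_tags(sample_response, tag_id=1, size=3)
print(result)
```

If so many tags tie that the queried tag falls outside the `size + 1` buckets, the filter removes nothing and the final slice still returns at most `size` buckets.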

Thank you in advance!
Michele

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c748f0bc-c2e9-4340-8936-f41345c71d55%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Luke_Nezda · August 15, 2014, 6:31pm

I have this problem too. With facets it was easily solved using the Terms
Facet's exclude feature
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-facet.html#_excluding_terms
but I haven't found a way to do the same with aggregations either.
Here's a gist demonstrating this:
Troubles porting an elasticsearch Terms Facet to a Terms Aggregation · GitHub

micpalmia · August 22, 2014, 9:47am

I added a comment to an existing GitHub issue about the exclude feature
of the terms aggregation. I think this is something that should be
fixed.

github.com/elastic/elasticsearch

Aggs: filtering values using array of values

opened 03:57PM - 08 Jul 14 UTC

closed 03:17PM - 12 Sep 14 UTC

dadoonet

In facets, we can filter a Terms Facet using an [array of values](http://www.ela…sticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-facet.html#_excluding_terms):

```json
{
  "query" : { "match_all" : { } },
  "facets" : {
    "tag" : {
      "terms" : {
        "field" : "tag",
        "exclude" : ["term1", "term2"]
      }
    }
  }
}
```

In aggs, we can't use the same syntax anymore as [IncludeExclude](https://github.com/elasticsearch/elasticsearch/blob/1.2/src/main/java/org/elasticsearch/search/aggregations/bucket/terms/support/IncludeExclude.java#L141-155) does not support arrays. The same applies to include.

We can probably use the `|` character to separate terms, but it would be handy to be able to specify an array of terms directly.

cc @jpountz
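
As an illustration of the `|` idea from the issue, here is a small Python sketch that joins a list of literal terms into a single exclude pattern. It only applies to string fields (integer fields are rejected, per the error quoted earlier in this thread). The helper names `exclude_pattern` and `terms_agg_with_exclude` are hypothetical, and it is an assumption that the escaping produced by Python's `re.escape` is accepted by the regex engine Elasticsearch uses.

```python
import re


def exclude_pattern(terms):
    """Join literal terms into one alternation, escaping regex metacharacters."""
    return "|".join(re.escape(str(t)) for t in terms)


def terms_agg_with_exclude(field, terms, size=10):
    """Build a terms aggregation body whose `exclude` is a single regex string."""
    return {
        "terms": {
            "field": field,
            "size": size,
            "exclude": exclude_pattern(terms),
        }
    }


# Same terms as the facet example in the issue, expressed as one pattern.
body = {
    "query": {"match_all": {}},
    "aggs": {"tag": terms_agg_with_exclude("tag", ["term1", "term2"])},
}
print(body["aggs"]["tag"]["terms"]["exclude"])
```

Escaping matters when a term contains characters such as `.` or `|`, which would otherwise be read as regex syntax instead of literal text.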
