Aggregations and special characters

Lucas_Rutledge · July 9, 2014, 5:08pm

I am performing a terms aggregation on a query to return the unique values
of a field, in this case a field holding emails in the format
example@gmail.com or example@aaa.example.com.

"aggregations": {
  "users_overall": {
    "terms": {
      "field": "email"
    }
  }
}

Instead of receiving back the full unique emails that I need, I get results
such as these:

{
  key: gmail.com
  doc_count: 121864
}
{
  key: yahoo.com
  doc_count: 68648
}
{
  key: roadrunner.com
  doc_count: 58194
}
{
  key: optimum.net
  doc_count: 35162
}
{
  key: hotmail.com
  doc_count: 31407
}
{
  key: nyc.rr.com
  doc_count: 24010
}
{
  key: aol.com
  doc_count: 22502
}

I've run into this problem with other fields whose values contain
special characters as well. Is there any way to perform an
aggregation like this that ignores the special characters and returns the
full value?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a0a6773c-3360-44fa-9acf-62a7ac05149e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ivan · July 9, 2014, 5:16pm

Aggregations work on the tokens indexed for the specified field. These tokens are
generated when the field's analyzer, and the tokenizer inside it, is applied at
index time. In your case, you do not want the field to be tokenized at all, so
you would need to either define it as not_analyzed or use a keyword tokenizer,
which emits the whole value as a single token.
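
As a rough sketch of those two options, the index-creation bodies might look like the following, written as Python dicts that serialize to the JSON Elasticsearch expects. This assumes the Elasticsearch 1.x string-mapping syntax that was current at the time of this thread. The mapping type name `user` and the analyzer name `whole_value` are made up for illustration; only the `email` field and the `users_overall` aggregation come from the original post.

```python
import json

# Hypothetical name for illustration: the thread only names the "email" field.
INDEX_TYPE = "user"  # assumed mapping type name

# Option 1: index the field as a single, unanalyzed term
# (Elasticsearch 1.x string-mapping syntax).
not_analyzed_mapping = {
    "mappings": {
        INDEX_TYPE: {
            "properties": {
                "email": {"type": "string", "index": "not_analyzed"}
            }
        }
    }
}

# Option 2: keep the field analyzed, but with a custom analyzer built on the
# keyword tokenizer, which emits the whole value as one token.
keyword_analyzer_mapping = {
    "settings": {
        "analysis": {
            "analyzer": {
                "whole_value": {  # assumed analyzer name
                    "type": "custom",
                    "tokenizer": "keyword",
                }
            }
        }
    },
    "mappings": {
        INDEX_TYPE: {
            "properties": {
                "email": {"type": "string", "analyzer": "whole_value"}
            }
        }
    },
}

# The aggregation from the original post, unchanged.
terms_aggregation = {
    "aggregations": {
        "users_overall": {"terms": {"field": "email"}}
    }
}

if __name__ == "__main__":
    # Print each body as the JSON that would be sent to Elasticsearch.
    for body in (not_analyzed_mapping, keyword_analyzer_mapping, terms_aggregation):
        print(json.dumps(body, indent=2))
```

Either body would be supplied when the index is created. Documents that were already indexed under the old mapping keep their old tokens, so they would need to be reindexed before the aggregation returns full addresses.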

Cheers,

Ivan


