Too-long values dropped from terms aggregation

Hi,
I'm using Elasticsearch 1.6.
While running an aggregation query I noticed a pretty large glitch in my results. After some investigation, I found that when the value being aggregated is too long (I'm not sure exactly how long; somewhere around 200-400 characters), the query just ignores it in the aggregation buckets but still counts it in the doc count.

Is there any fix or workaround for this issue?

Hi,
Can you post an example of your query along with the response you get and the response you would expect instead?

Hi,
I've recreated something similar on my local machine.
I started a new index and posted the following data to http://localhost:9200/test/testtype/2:

{"gtype":"test1", "content":"testtesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttest" }

Then I tried this query:
GET /test/_search
{
  "size": 0,
  "aggs": {
    "agg1": {
      "terms": {
        "field": "content",
        "size": 3000,
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

and got this result:
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "agg1": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "sttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttest",
          "doc_count": 1
        },
        {
          "key": "testtesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttes",
          "doc_count": 1
        },
        {
          "key": "ttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttesttestte",
          "doc_count": 1
        }
      ]
    }
  }
}

which doesn't make sense to me.

Hi Boris,

That's a funky bug! I can reproduce this with ES 1.6, 1.7 and also 2.3.1. I might be missing something obvious, but this looks so strange that I would ask you to open a GitHub issue for it with the above reproduction.

Ah well, no bug after all; I found the problem. The standard analyzer, which is used when you don't specify an explicit string mapping, is the culprit: if you look at https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-standard-analyzer.html you can see that its max_token_length parameter splits terms at 255 characters by default. So your long string is indexed as several tokens that each end up in their own bucket; the bucket keys in your response are just consecutive slices of the original string.
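If you want the whole value to land in a single bucket, here is a minimal sketch of the usual workaround on 1.x/2.x, assuming you can recreate the index (the "raw" sub-field name is just my own choice, not anything from your setup): add a not_analyzed multi-field and aggregate on that, so the value is stored as one untokenized term.

DELETE /test

PUT /test
{
  "mappings": {
    "testtype": {
      "properties": {
        "content": {
          "type": "string",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}

After re-indexing the document, pointing the aggregation at the sub-field should return the full string as a single bucket:

GET /test/_search
{
  "size": 0,
  "aggs": {
    "agg1": {
      "terms": {
        "field": "content.raw",
        "size": 3000
      }
    }
  }
}

This keeps content itself analyzed for full-text search while content.raw holds the exact value for aggregations.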