Document contains at least one immense term in field="REGIONS" (whose UTF8 e

LDA-MAN · December 29, 2015, 1:14am

When I build the index for my data, a field that is not analyzed triggers this error. Is its value too long?
I tried to fix it with ignore_above, but that did not help; the issue appeared again!
This is my index mapping:
.startObject("REGIONS")
.field("type","string")
.field("store","yes")
.field("index","not_analyzed")
.field("ignore_above","100000")
.endObject()
The exception looks like this:

Joshua_Rich · December 29, 2015, 4:14am

You are hitting Lucene’s term byte-length limit of 32766 for this field. Note that 32766 is a byte-length limit, while the ignore_above setting is a character-count limit. This is an important distinction because, depending on your text, a single character may take multiple bytes to store.
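To make the byte-versus-character distinction concrete, here is a minimal Java sketch using only the standard library. The class and helper names (TermLengthDemo, utf8Length, isImmense, repeat) are hypothetical and exist only for this illustration; the 32766 constant is the limit quoted above.

```java
import java.nio.charset.StandardCharsets;

class TermLengthDemo {
    // Lucene's maximum term length in bytes, as quoted in this thread.
    static final int MAX_TERM_BYTES = 32766;

    // Number of bytes the value occupies once encoded as UTF-8.
    static int utf8Length(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the value, kept as one unanalyzed term, exceeds the byte limit.
    static boolean isImmense(String value) {
        return utf8Length(value) > MAX_TERM_BYTES;
    }

    // Hypothetical helper: builds a string of `count` copies of `c`.
    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String ascii = repeat('a', 20000);      // 1 byte per character in UTF-8
        String cjk = repeat('\u533A', 20000);   // 3 bytes per character in UTF-8

        // Same character count, very different byte lengths.
        System.out.println(ascii.length() + " chars, " + utf8Length(ascii)
                + " bytes, immense=" + isImmense(ascii));
        System.out.println(cjk.length() + " chars, " + utf8Length(cjk)
                + " bytes, immense=" + isImmense(cjk));
    }
}
```

Both strings have 20000 characters, but only the second one crosses the byte limit, which is why a character-based ignore_above of 100000 does not protect the field.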

So you should really set ignore_above much lower. I'd suggest a realistic character count that will stop the extreme cases from being indexed. However, you may be better off just setting index: no for this field, so that it is not searchable at all (but can still be retrieved in the results).
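As a rough guide to how low "much lower" needs to be, the sketch below works out a conservative ceiling. It assumes the worst case of 4 bytes per character in UTF-8; the class and method names (IgnoreAboveLimit, safeIgnoreAbove, worstCaseBytes, fitsTermLimit) are hypothetical and not part of any Elasticsearch API.

```java
class IgnoreAboveLimit {
    // Lucene's maximum term length in bytes, as quoted in this thread.
    static final int MAX_TERM_BYTES = 32766;

    // A single character never needs more than 4 bytes in UTF-8.
    static final int MAX_UTF8_BYTES_PER_CHAR = 4;

    // Largest character count that stays under the byte limit
    // even if every character takes the full 4 bytes.
    static int safeIgnoreAbove() {
        return MAX_TERM_BYTES / MAX_UTF8_BYTES_PER_CHAR;
    }

    // Worst-case UTF-8 size of a value that is exactly ignoreAbove characters long.
    static long worstCaseBytes(int ignoreAbove) {
        return (long) ignoreAbove * MAX_UTF8_BYTES_PER_CHAR;
    }

    // True if no value accepted under this ignore_above can exceed the term limit.
    static boolean fitsTermLimit(int ignoreAbove) {
        return worstCaseBytes(ignoreAbove) <= MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        System.out.println("safe ceiling: " + safeIgnoreAbove());
        System.out.println("100000 fits: " + fitsTermLimit(100000));
        System.out.println("256 fits: " + fitsTermLimit(256));
    }
}
```

Under that assumption any ignore_above up to 8191 is safe regardless of the text, while the 100000 from the original mapping allows values far beyond the limit.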

LDA-MAN · December 29, 2015, 4:37am

Thanks, I set it to 256 and that solved the problem!
