Hi,
I'm on Elasticsearch 7.0.0 and have been getting errors like the following in the Elasticsearch logs:
org.elasticsearch.transport.RemoteTransportException: [host][ip:9300][indices:data/read/search[phase/fetch/id]]
Caused by: java.lang.IllegalArgumentException: The length of [index.keyword] field of [some_doc] doc of [index] index has exceeded [1000000] - maximum allowed to be analyzed for highlighting. This maximum can be set by changing the [index.highlight.max_analyzed_offset] index level setting. For large texts, indexing with offsets or term vectors is recommended!
at org.elasticsearch.search.fetch.subphase.highlight.UnifiedHighlighter.highlight(UnifiedHighlighter.java:88) ~[elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.search.fetch.subphase.highlight.HighlightPhase.hitExecute(HighlightPhase.java:107) ~[elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:169) ~[elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.search.SearchService.lambda$executeFetchPhase$3(SearchService.java:540) ~[elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.search.SearchService$3.doRun(SearchService.java:380) ~[elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) ~[elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.0.0.jar:7.0.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:835) [?:?]
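For reference, the error points at the [index.highlight.max_analyzed_offset] index-level setting, which I understand can be raised per index. A minimal sketch, with `my-index` as a placeholder name and an arbitrarily larger limit:

```
PUT /my-index/_settings
{
  "index.highlight.max_analyzed_offset": 10000000
}
```

But raising the limit presumably just defers the problem until an even larger document comes along.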
I set term_vector on some large fields, as suggested in the error message above, but it increases the index size (with_positions_offsets). Also, after fixing a few large fields, I started getting the error for other fields whose values are large in only a very few documents and small in most others. (Is it okay to define term vectors on those fields too?)
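This is roughly the mapping change I made (`my-index` and `large_text_field` are placeholder names; as far as I know term_vector cannot be changed on an existing field, so it went into the mapping of a new index followed by a reindex):

```
PUT /my-index
{
  "mappings": {
    "properties": {
      "large_text_field": {
        "type": "text",
        "term_vector": "with_positions_offsets"
      }
    }
  }
}
```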
The error seems to occur only for [index.keyword] fields, which I suppose are used for aggregations.
So, looking at the error, it seems there are two ways to solve this problem (rough sketches of both follow below):
- Disable highlighting on those indices (assuming the problem occurs only during highlighting).
- Disable aggregations on large fields (assuming the problem is only with .keyword fields).
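For the first option, if the highlight request uses a wildcard like `"*"` (which would match the .keyword sub-fields too), I could instead list only the analyzed text fields. A sketch, assuming a text field named `body` on a placeholder `my-index`:

```
GET /my-index/_search
{
  "query": { "match": { "body": "something" } },
  "highlight": {
    "fields": { "body": {} }
  }
}
```

For the second option, I could drop the .keyword multi-field from the large fields' mappings so they can no longer be aggregated on (placeholder names again; this would also require a reindex):

```
PUT /my-index
{
  "mappings": {
    "properties": {
      "large_field": { "type": "text" }
    }
  }
}
```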
Can you please suggest which is the right approach in this case?