My issue is specific to the Hungarian language, and it is quite strange.
Consider the following word: "alma" ("apple" in English).
This is a noun, but it's also a genitive of "alom" (bedding).
Here's the strange behaviour:
In ES 2.x, if I searched for the word "alma" with highlighting, I got back the related documents with "alma" highlighted.
In ES 5.0.0 rc1, if I do the same, "alma" is not highlighted. If I search for the word "alom", I get back "alma" highlighted.
Both ES versions find the documents where "alma" or any other inflected form exists, but the highlighting is strange.
Note that I installed hunspell just by copying the Hungarian dictionary files to the proper place (/etc/elasticsearch/hunspell), and here's the configuration for my index:
Thanks, this does the trick, now it's working!
Another thing: as you can see, I indexed the field with the setting "term_vector": "with_positions_offsets_payloads".
Now when I set "type": "postings" in the highlight section of the request, I get the following response:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "the field [content] should be indexed with positions and offsets in the postings list to be used with postings highlighter"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query_fetch",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "test_string",
        "node": "bMJ9xy_bRZ-qIDX4n4f-rA",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "the field [content] should be indexed with positions and offsets in the postings list to be used with postings highlighter"
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "the field [content] should be indexed with positions and offsets in the postings list to be used with postings highlighter"
    }
  },
  "status": 400
}
The number of available highlighters is an indication of how tricky the problem of highlighting can be.
Each of these was an attempt by one or more people to "fix" the problem of highlighting where previous highlighters had shortcomings.
Some of the highlighters address the problem by creating special data structures at index time to support the highlighting process. These data structure choices are configured in the mapping definition, so turning on these options effectively dictates which Highlighter implementation is used by default for that field. Each Highlighter implementation documents the type of data structures it requires in the mapping, e.g. the Postings highlighter [1].
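For the Postings highlighter specifically, the error above is about index options, not term vectors: it reads offsets from the postings list, which is enabled with "index_options": "offsets" in the mapping, while "term_vector": "with_positions_offsets" is what the fast vector highlighter relies on. Below is a minimal sketch of the two request bodies, built as Python dicts and printed as JSON. The mapping type name "doc" and the helper names are made up for illustration, the index and field names come from the error above, and the analyzer settings are left out. Documents already in the index would need to be reindexed before the offsets are available.

```python
import json

# Hypothetical mapping type name; the field name "content" comes from the error above.
MAPPING_TYPE = "doc"


def postings_mapping(field="content"):
    """Index-creation body that stores offsets in the postings list for `field`."""
    return {
        "mappings": {
            MAPPING_TYPE: {
                "properties": {
                    field: {
                        "type": "text",
                        # This is what the postings highlighter asks for in the error.
                        "index_options": "offsets",
                    }
                }
            }
        }
    }


def highlight_query(text, field="content", highlighter="postings"):
    """Search body that matches `text` and requests a specific highlighter type."""
    return {
        "query": {"match": {field: text}},
        "highlight": {"fields": {field: {"type": highlighter}}},
    }


if __name__ == "__main__":
    # Body for creating the index (e.g. PUT /test_string).
    print(json.dumps(postings_mapping(), indent=2))
    # Body for the search request (e.g. POST /test_string/_search).
    print(json.dumps(highlight_query("alma"), indent=2))
```

The analyzer configuration from the original index would still have to be merged into the field definition; it is omitted here because only the index options matter for this error.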
Despite your mapping choices, you can always revert to the plain highlighter ("type": "plain"), as this does not require any special index structures (but as a consequence may not be as fast as other implementations). As you have discovered though, sometimes faster Highlighter implementations may not work as well as the plain highlighter for certain choices of analyzers.
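As a rough sketch of that fallback, the helper below guesses which highlighter types a field mapping can support and builds a request body that forces the plain one. The function names are hypothetical, and the rules encoded are only the mapping requirements discussed in this thread (offsets in the postings list for "postings", term vectors with positions and offsets for "fvh"), not an official API:

```python
import json


def usable_highlighters(field_mapping):
    """Return the highlighter types a field mapping should be able to support."""
    usable = ["plain"]  # needs no special index structures
    if field_mapping.get("index_options") == "offsets":
        usable.append("postings")
    # Covers "with_positions_offsets" and "with_positions_offsets_payloads".
    if field_mapping.get("term_vector", "").startswith("with_positions_offsets"):
        usable.append("fvh")
    return usable


def plain_highlight_request(text, field="content"):
    """Search body that forces the plain highlighter regardless of the mapping."""
    return {
        "query": {"match": {field: text}},
        "highlight": {"fields": {field: {"type": "plain"}}},
    }


if __name__ == "__main__":
    # The field as described in the question: term vectors but no offsets in postings.
    mapping = {"type": "text", "term_vector": "with_positions_offsets_payloads"}
    print(usable_highlighters(mapping))
    print(json.dumps(plain_highlight_request("alma"), indent=2))
```

With the mapping from the question this reports "plain" and "fvh" but not "postings", which matches the error returned when "type": "postings" was requested.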