When running a span_near query in ES 1.7, the highlighter took in_order and slop into account when highlighting terms. In ES 2.3 this no longer seems to be the case. My case extends this one:
Here's how to reproduce it.
curl -XPUT 'http://localhost:9200/twitter/' -d '{
"mappings": {
"tweet": {
"properties": {
"message": {
"type": "string",
"store": true
}
}
}
}
}'
curl -XPUT 'localhost:9200/twitter/tweet/1?refresh=true' -d '{
"message" : ["short leg twice syndrome","other words short syndrome"]
}'
Then run this span_near query:
curl -XGET 'http://localhost:9200/twitter/tweet/_search?pretty' -d '{
"query" : {
"span_near" : {
"clauses" : [
{"span_term": {"message": "short"}},
{"span_term": {"message": "syndrome"}}
],
"slop": 0,
"in_order": true
}
},
"highlight": {"fields": {"message": {"type": "plain"} } }
}'
In ES 1.7.1 you get:
{
"took" : 56,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.11072598,
"hits" : [ {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "1",
"_score" : 0.11072598,
"_source":{
"message" : ["short leg twice syndrome","other words short syndrome"]
},
"highlight" : {
"message" : [ "other words <em>short</em> <em>syndrome</em>" ]
}
} ]
}
}
In ES 2.3.2 you get:
{
"took" : 56,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 0.19178301,
"hits" : [ {
"_index" : "twitter",
"_type" : "tweet",
"_id" : "1",
"_score" : 0.19178301,
"_source" : {
"message" : [ "short leg twice syndrome", "other words short syndrome" ]
},
"highlight" : {
"message" : [ "<em>short</em> leg twice <em>syndrome</em>", "other words <em>short</em> <em>syndrome</em>" ]
}
} ]
}
}
In 2.3.2, 'short' and 'syndrome' in "short leg twice syndrome" get highlighted, even though that value doesn't satisfy the slop/in_order constraints of the span_near query.
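To spell out why I'd expect only the second value to be highlighted, here is a rough sketch of the gap between the two terms in each value. It assumes plain whitespace tokenization and is only an approximation of how an in-order span_near counts slop, not what Lucene actually does internally. `span_gap` is a hypothetical helper of mine, not anything from Elasticsearch.

```shell
# Hypothetical helper (not part of Elasticsearch): prints the smallest number
# of tokens found between "short" and a later "syndrome" in the given value,
# assuming simple whitespace tokenization. Prints nothing if there is no
# in-order pair.
span_gap() {
  printf '%s\n' "$1" | awk '{
    last = 0; best = -1
    for (i = 1; i <= NF; i++) {
      if ($i == "short") {
        last = i                      # remember the most recent "short"
      } else if ($i == "syndrome" && last > 0) {
        gap = i - last - 1            # tokens sitting between the two terms
        if (best < 0 || gap < best) best = gap
      }
    }
    if (best >= 0) print best
  }'
}

span_gap "short leg twice syndrome"     # prints 2
span_gap "other words short syndrome"   # prints 0
```

With "slop": 0 and "in_order": true, only a gap of 0 should qualify, which is why I'd expect highlighting in "other words short syndrome" but not in "short leg twice syndrome".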
Is this an intentional change? If it is, is there a way to work around it and get back the 1.7.1-style highlighting, where slop/in_order are taken into account?
Many thanks,
Phil