Unexpected highlighting


(Валентин Ливкин) #1

Hello.

I don't understand the highlighting logic of Elasticsearch.
Here is my test:
Elasticsearch 1.7.1
Query:

{
  "query": {
    "span_near": {
      "clauses": [
        {
          "span_term": {
            "Text": {
              "value": "Test1"
            }
          }
        },
        {
          "span_or": {
            "clauses": [
              {
                "span_term": {
                  "Text": {
                    "value": "test2"
                  }
                }
              },
              {
                "span_near": {
                  "clauses": [
                    {
                      "span_term": {
                        "Text": {
                          "value": "foo"
                        }
                      }
                    },
                    {
                      "span_term": {
                        "Text": {
                          "value": "test3"
                        }
                      }
                    }
                  ],
                  "slop": 1,
                  "in_order": true
                }
              }
            ]
          }
        }
      ],
      "slop": 20,
      "in_order": false
    }
  },
  "highlight": {
    "fields": {
      "*": {}
    }
  }
}

Example 1:
Text: "Test1 test3 test2"
Highlighting response: "<em>Test1</em> <em>test3</em> <em>test2</em>"
Highlighting of "test3" is not expected: in the query it appears only inside the inner span_near, which requires two terms ("foo" followed by "test3"), and "foo" is not in the text.

Example 2:
Text: "Test1 test2 test3"
Highlighting response: "<em>Test1</em> <em>test2</em> test3"
Here "test3" is not highlighted.

Could someone explain to me why "test3" is highlighted in the first example?
P.S. The tests used the plain highlighter.
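
One possible reading of the two examples above, offered as an assumption and not as confirmed Elasticsearch behavior: the plain highlighter may mark every query term whose position falls inside the overall matching span, not only the terms that actually produced the match. The Python sketch below is a toy model of that idea. The helper names (span_term, span_or, span_near, highlight) are hypothetical, tokenization is a plain whitespace split, and term comparison is case-insensitive for simplicity.

```python
from itertools import product

# Toy model: a span is (start, end) over token positions, end exclusive.
# All names here are hypothetical helpers, not Elasticsearch or Lucene APIs.

QUERY_TERMS = {"test1", "test2", "foo", "test3"}


def span_term(tokens, term):
    """Spans of length 1 for every position where the term occurs."""
    return [(i, i + 1) for i, tok in enumerate(tokens)
            if tok.lower() == term.lower()]


def span_or(*clauses):
    """Union of the spans produced by each clause."""
    return sorted({span for clause in clauses for span in clause})


def span_near(clauses, slop, in_order):
    """Naive near match: one span per clause, unmatched positions <= slop."""
    matches = set()
    for combo in product(*clauses):
        if in_order and any(b[0] < a[1] for a, b in zip(combo, combo[1:])):
            continue  # clauses must appear in the given order
        start = min(s for s, _ in combo)
        end = max(e for _, e in combo)
        gap = (end - start) - sum(e - s for s, e in combo)
        if 0 <= gap <= slop:
            matches.add((start, end))
    return sorted(matches)


def highlight(text):
    """Wrap query terms that lie inside any top-level matching span."""
    tokens = text.split()
    inner = span_near(
        [span_term(tokens, "foo"), span_term(tokens, "test3")],
        slop=1, in_order=True)
    either = span_or(span_term(tokens, "test2"), inner)
    outer = span_near(
        [span_term(tokens, "Test1"), either],
        slop=20, in_order=False)
    out = []
    for i, tok in enumerate(tokens):
        covered = any(s <= i < e for s, e in outer)
        is_query_term = tok.lower() in QUERY_TERMS
        out.append(f"<em>{tok}</em>" if covered and is_query_term else tok)
    return " ".join(out)


print(highlight("Test1 test3 test2"))
print(highlight("Test1 test2 test3"))
```

Under this model, in Example 1 the outer span runs from "Test1" to "test2" and so covers "test3", which is a query term and gets marked. In Example 2 the outer span ends at "test2", so "test3" lies outside it and stays unmarked. That agrees with both observed responses, but it remains a sketch under the stated assumption.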

