Highlighting substrings in matched terms


(Robert Gründler) #1

Hi,

I'm trying to highlight my search results, and I'm not sure I understand
the highlighting mechanism correctly. My document has two fields: "firstname"
and "lastname".

When searching for the term "jack", the document {firstname: "John",
lastname: "Jackson"} is highlighted as {lastname: "<em>Jackson</em>"}.
I'd expect the highlighting to be "<em>Jack</em>son", though.

Is the highlighting I'm getting the correct result, or am I missing
something in my query?

regards

-robert


(Lukáš Vlček) #2

Hi,

The text analysis step is what matters in this case. Could this be because you
are using an analyzer that collapses "Jackson" to the term "jack"?

Regards,
Lukas


(Robert Gründler) #3

The text analysis step is what matters in this case. Could this be because
you are using an analyzer that collapses "Jackson" to the term "jack"?

I did not configure any specific analyzers, so I guess the defaults are
being used. The highlighting I described happens the same way with
any other term too, so I guess it's not due to collapsing/stemming of
the input being indexed.

Could it be that I need to use an EdgeNGram tokenizer during the indexing
phase to get this kind of highlighting?
(http://www.elasticsearch.org/guide/reference/index-modules/analysis/edgengram-tokenizer.html)
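
For readers unfamiliar with edge n-grams, here is a rough pure-Python sketch of what such a tokenizer would emit for a single word. It is a simplified model and not Elasticsearch code; the function name and the min_gram/max_gram values are assumptions chosen for illustration.

```python
def edge_ngrams(token, min_gram=2, max_gram=30):
    """Return the prefixes of `token` whose length is between min_gram and max_gram.

    Simplified model of edge n-gram tokenization; the parameter values
    are illustrative assumptions, not Elasticsearch defaults.
    """
    longest = min(max_gram, len(token))
    return [token[:size] for size in range(min_gram, longest + 1)]


prefixes = edge_ngrams("jackson")
print(prefixes)
```

Under this model the prefix "jack" becomes an indexed term of its own, so a search for "jack" could match it directly.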

regards

-robert


(Robert Gründler) #4

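Using an nGram filter during analysis solved the highlighting problem:

    # Index settings: a custom analyzer that indexes the n-grams of each word.
    settings = { 'index': {
        'analysis' : {
            'analyzer' : {
                'typeahead_analyzer' : {
                    # Split on non-letters and lowercase each token.
                    'tokenizer' : 'lowercase',
                    # Drop stop words, then expand each token into n-grams.
                    'filter' : ['stop', 'ta_ngram'],
                    'type' : 'custom'
                }
            },
            'filter' : {
                'ta_ngram' : {
                    # Emit every substring of 2 to 30 characters.
                    'type' : 'nGram',
                    'max_gram' : 30,
                    'min_gram' : 2
                }
            }
        }
    }}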
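
To see why this helps, the following pure-Python sketch approximates the terms such an analyzer would produce for "Jackson". It is only a model of the lowercase tokenizer followed by the n-gram filter: the stop filter is left out, the function names are made up for illustration, and real Elasticsearch output may differ in ordering and offsets.

```python
import re


def ngrams(token, min_gram=2, max_gram=30):
    """Return all substrings of `token` with length in [min_gram, max_gram]."""
    grams = []
    for size in range(min_gram, min(max_gram, len(token)) + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


def typeahead_terms(text, min_gram=2, max_gram=30):
    """Approximate the analyzer: lowercase, split on non-letters, expand to n-grams.

    Assumption: the 'stop' filter is omitted to keep the sketch short.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    terms = []
    for word in words:
        terms.extend(ngrams(word, min_gram, max_gram))
    return terms


terms = typeahead_terms("Jackson")
print("jack" in terms)
print(len(terms))
```

Because "jack" is now an indexed term in its own right, the query can match that substring instead of the whole word, which is presumably why the highlighter marks only "Jack".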
