Highlight in Elasticsearch 5.4


(Arpan Sahoo) #1

How can I highlight a partial match in ES 5.4?

Example: I have a FirstName field with the value "Jason", and I am querying as:
"query": {
  "bool": {
    "should": [
      {
        "match": {
          "FirstName.Prefix": {
            "query": "Ja"
          }
        }
      }
    ]
  }
}

I am getting the desired hits, but I can't find a way to highlight only "Ja" in the search result, i.e. my search result should look something like:
"_source": {
  "FirstName": "Jason"
},
"highlight": {
  "FirstName": [
    "<em>Ja</em>son"
  ]
}

Is this possible?


(Adrien Grand) #2

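You would need to use the edge_ngram tokenizer rather than the edge n-gram token filter to get it to work this way. Token filters are not allowed to modify the offsets that have been set by the tokenizer, so every gram produced by a filter keeps the offsets of the whole original token and the highlighter marks the entire word.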
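
To make the offset point concrete, here is a toy Python model. It is not Elasticsearch or Lucene code: the function names are made up for illustration, and it assumes the field value is a single lowercase token. It only shows why grams that inherit the whole token's offsets lead to the whole word being highlighted, while grams with their own offsets allow a partial highlight.

```python
# Toy model of analysis offsets; not Elasticsearch or Lucene code.

def edge_ngrams(term, min_gram=1, max_gram=30):
    """Return the leading substrings of term, from min_gram to max_gram characters."""
    return [term[:n] for n in range(min_gram, min(len(term), max_gram) + 1)]

def filter_style(text):
    """Grams as a token filter would emit them: each keeps the whole token's offsets."""
    return [(gram, 0, len(text)) for gram in edge_ngrams(text)]

def tokenizer_style(text):
    """Grams as a tokenizer would emit them: each carries its own offsets."""
    return [(gram, 0, len(gram)) for gram in edge_ngrams(text)]

def highlight(text, tokens, query):
    """Wrap the offsets of the first token equal to query in <em> tags."""
    for term, start, end in tokens:
        if term == query:
            return text[:start] + "<em>" + text[start:end] + "</em>" + text[end:]
    return text

text = "jason"
print(highlight(text, filter_style(text), "ja"))     # <em>jason</em>
print(highlight(text, tokenizer_style(text), "ja"))  # <em>ja</em>son
```

In both cases the query "ja" matches, which is why the search itself returns the desired hits; only the highlighted span differs.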


(Arpan Sahoo) #3
"settings": {
"index": {
  "analysis": {
    "filter": {
      "first_name_synonym_filter": {
        "type": "synonym",
        "synonyms": [
          "aaron,erin,ronnie,ron",
          "abbie,abby,abigail",
          etc..
        ]
      },
      "prefix_filter": {
        "type": "edgeNGram",
        "max_gram": "30"
      },
      "alphanum_filter": {
        "pattern": "\\W",
        "type": "pattern_replace",
        "replacement": ""
      }
    },
    "analyzer": {
      "basic_analyzer": {
        "filter": [
          "standard",
          "lowercase",
          "asciifolding",
          "alphanum_filter"
        ],
        "type": "custom",
        "tokenizer": "uax_url_email"
      },
      "first_name_synonym_analyzer": {
        "filter": [
          "standard",
          "lowercase",
          "asciifolding",
          "alphanum_filter",
          "first_name_synonym_filter"
        ],
        "type": "custom",
        "tokenizer": "uax_url_email"
      },
      "basic_analyzer_prefix": {
        "filter": [
          "standard",
          "lowercase",
          "asciifolding",
          "alphanum_filter",
          "prefix_filter"
        ],
        "type": "custom",
        "tokenizer": "uax_url_email"
      }
    }
  }
}
},
"mappings": {
"Consumer": {
  "_all": {
    "enabled": false
  },
  "properties": {        
    "Email": {
      "type": "text",
      "fields": {
        "Prefix": {
          "type": "text",
          "analyzer": "basic_analyzer_prefix",
          "search_analyzer": "basic_analyzer"
        }
      },
      "analyzer": "basic_analyzer"
    },
    "FirstName": {
      "type": "text",
      "fields": {
        "Synonym": {
          "type": "text",
          "analyzer": "first_name_synonym_analyzer",
          "search_analyzer": "basic_analyzer"
        }
      },
      "copy_to": [
        "FullName"
      ],
      "analyzer": "basic_analyzer"
    },
    "FullName": {
      "type": "text",
      "fields": {
        "Prefix": {
          "type": "text",
          "analyzer": "basic_analyzer_prefix",
          "search_analyzer": "basic_analyzer"
        }
      },
      "analyzer": "basic_analyzer"
    },        
    "LastName": {
      "type": "text",
      "fields": {
        "Exact": {
          "type": "keyword"
        }
      },
      "copy_to": [
        "FullName"
      ],
      "analyzer": "basic_analyzer"
    }
  }
}    

}
Thanks for the reply. I understand what you are pointing to, and it's really helpful.

Above are my settings and mappings. I am not sure how to apply your suggestion to them.

My query is:

"query": {
  "bool": {
    "should": [
      { "match": { "FullName.Prefix": { "query": "Ryan" } } },
      { "match": { "Email.Prefix": { "query": "Ryan" } } }
    ],
    "minimum_should_match": 1
  }
},
"highlight": {
  "fields": {
    "FirstName": { "require_field_match": false },
    "LastName": { "require_field_match": false },
    "Email": { "require_field_match": false }
  }
}

Do I need to implement a prefix analyzer for each of these fields in order to highlight them?
Or is there an alternative way to achieve this?


(Adrien Grand) #4

You can try out the following (untested, there might be typos). It won't match exactly like your current settings/mappings, but hopefully this will help you move forward.

"settings": {
"index": {
  "analysis": {
    "tokenizer": {
      "prefix_tokenizer": {
        "type": "edge_ngram",
        "max_gram": "30",
        "token_chars": [ "letter", "punctuation" ]
      }
    },
    "filter": {
      "first_name_synonym_filter": {
        "type": "synonym",
        "synonyms": [
          "aaron,erin,ronnie,ron",
          "abbie,abby,abigail",
          etc..
        ]
      },
      "alphanum_filter": {
        "pattern": "\\W",
        "type": "pattern_replace",
        "replacement": ""
      }
    },
    "analyzer": {
      "basic_analyzer": {
        "filter": [
          "standard",
          "lowercase",
          "asciifolding",
          "alphanum_filter"
        ],
        "type": "custom",
        "tokenizer": "uax_url_email"
      },
      "first_name_synonym_analyzer": {
        "filter": [
          "standard",
          "lowercase",
          "asciifolding",
          "alphanum_filter",
          "first_name_synonym_filter"
        ],
        "type": "custom",
        "tokenizer": "uax_url_email"
      },
      "basic_analyzer_prefix": {
        "filter": [
          "standard",
          "lowercase",
          "asciifolding",
          "alphanum_filter"
        ],
        "type": "custom",
        "tokenizer": "prefix_tokenizer"
      }
    }
  }
}
},
"mappings": {
"Consumer": {
  "_all": {
    "enabled": false
  },
  "properties": {        
    "Email": {
      "type": "text",
      "fields": {
        "Prefix": {
          "type": "text",
          "analyzer": "basic_analyzer_prefix",
          "search_analyzer": "basic_analyzer"
        }
      },
      "analyzer": "basic_analyzer"
    },
    "FirstName": {
      "type": "text",
      "fields": {
        "Synonym": {
          "type": "text",
          "analyzer": "first_name_synonym_analyzer",
          "search_analyzer": "basic_analyzer"
        }
      },
      "copy_to": [
        "FullName"
      ],
      "analyzer": "basic_analyzer"
    },
    "FullName": {
      "type": "text",
      "fields": {
        "Prefix": {
          "type": "text",
          "analyzer": "basic_analyzer_prefix",
          "search_analyzer": "basic_analyzer"
        }
      },
      "analyzer": "basic_analyzer"
    },        
    "LastName": {
      "type": "text",
      "fields": {
        "Exact": {
          "type": "keyword"
        }
      },
      "copy_to": [
        "FullName"
      ],
      "analyzer": "basic_analyzer"
    }
  }
}
}
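
As a rough follow-up sketch (equally untested), the Python below only builds two request bodies and prints them as JSON. Sending them is left out because the thread does not name the index. It assumes that highlights have to be requested on the sub-field that actually holds the grams; Email.Prefix is used simply because it appears in the mapping above, and the helper names are hypothetical.

```python
import json

def build_prefix_highlight_search(text):
    """Build a search body that matches and highlights on the Email.Prefix sub-field."""
    return {
        "query": {"match": {"Email.Prefix": {"query": text}}},
        # Assumption: highlighting the gram sub-field is what yields partial highlights.
        "highlight": {"fields": {"Email.Prefix": {}}},
    }

def build_analyze_body(text):
    """Build an _analyze body for inspecting each token's start_offset and end_offset."""
    return {"analyzer": "basic_analyzer_prefix", "text": text}

print(json.dumps(build_prefix_highlight_search("Ry"), indent=2))
print(json.dumps(build_analyze_body("Ryan"), indent=2))
```

The second body is meant for the index's _analyze endpoint. If the tokens it returns all share the same start_offset and end_offset, the grams are still coming from a token filter rather than from the tokenizer.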

(Arpan Sahoo) #5

Thanks a lot. 🙂

