Hi,
I'm using Elasticsearch 2.4 and running a phrase match query with synonyms. My setup and the issues I'm facing are described below; any suggestions are welcome. I index the documents and query the index in the same session.
The input to the following index settings is lemmatized text (lemmatized with WordNet), for example:
"Updating Mobile Phone number in BANK Updating Mobile Phone number in BANK.txt "
nlp_settings_cssections = {
    "settings": {
        "number_of_shards": 1,
        "analysis": {
            "char_filter": {
                "my_char_filter": {
                    "type": "mapping",
                    "mappings": [
                        "\n => .",
                        "1 => 1"
                    ]
                }
            },
            "filter": {
                "my_synonym_filter": {
                    "type": "synonym",
                    "synonyms": []
                },
                "english_stop": {
                    "type": "stop",
                    "stopwords": "english"
                },
                "english_stemmer": {
                    "type": "stemmer",
                    "language": "light_english"
                },
                "english_possessive_stemmer": {
                    "type": "stemmer",
                    "language": "possessive_english"
                }
            },
            "analyzer": {
                "my_english_analyzer": {
                    "type": "custom",
                    "char_filter": [
                        "my_char_filter"
                    ],
                    "tokenizer": "standard",
                    "filter": [
                        "english_possessive_stemmer",
                        "lowercase",
                        "my_synonym_filter",
                        "english_stemmer"
                    ]
                }
            }
        }
    },
    "mappings": {
        "section": {
            "properties": {
                "seccontent": {
                    "type": "string",
                    "analyzer": "my_english_analyzer"
                }
            }
        }
    }
}
Synonyms (the "synonyms" list in my_synonym_filter holds one of the following)
option 1. "click on service catalog, change, updating, changing, update"
option 2. "click on service catalog, change, updating, changing => update"
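To make the difference between the two rule formats concrete, here is a rough Python model of how Solr-style synonym rules are usually read. It is only a sketch: parse_synonym_rule is a hypothetical helper, not Elasticsearch or Lucene code, and it ignores how multi-word entries such as "click on service catalog" are tokenized and positioned.

```python
def parse_synonym_rule(rule, expand=True):
    """Return {input_phrase: [output_phrases]} for one Solr-style synonym rule.

    Hypothetical helper for illustration only; it models the documented rule
    semantics, not the actual Lucene implementation.
    """
    def split_terms(text):
        return [t.strip().lower() for t in text.split(",") if t.strip()]

    if "=>" in rule:
        # Explicit mapping: each left-hand phrase is replaced by the right-hand side.
        left, right = rule.split("=>", 1)
        outputs = split_terms(right)
        return {term: outputs for term in split_terms(left)}

    # Equivalent synonyms: with expand=True every phrase maps to the whole group,
    # with expand=False every phrase maps to the first phrase only.
    terms = split_terms(rule)
    outputs = terms if expand else terms[:1]
    return {term: outputs for term in terms}


option_1 = parse_synonym_rule("click on service catalog, change, updating, changing, update")
option_2 = parse_synonym_rule("click on service catalog, change, updating, changing => update")

print(option_1["updating"])   # the whole group, including "update"
print(option_2["updating"])   # only "update"
print("update" in option_2)   # "update" is not a left-hand term in option 2
```

In this model, option 1 emits every phrase of the group wherever any one of them occurs, while option 2 rewrites only the left-hand phrases and leaves "update" itself untouched.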
Phrase Query
import json

# "%s" is a placeholder for the search phrase, filled in before the query is sent
query_phrase_cssections = json.dumps(
    {"from": 0, "size": 20,
     "query": {
         "match_phrase": {
             "seccontent": {
                 "query": "%s",
                 "slop": 3
             }
         }
     }
    }
)
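A possible alternative to the "%s" placeholder, assuming it is filled with Python string formatting after json.dumps: build the dict with the phrase already in place and serialize it afterwards, so that quotes or backslashes in the phrase are escaped correctly. build_phrase_query is a hypothetical helper name; the field name and slop are the ones from the query above.

```python
import json


def build_phrase_query(phrase, slop=3, size=20):
    """Return the match_phrase request body as a JSON string (hypothetical helper)."""
    body = {
        "from": 0,
        "size": size,
        "query": {
            "match_phrase": {
                "seccontent": {
                    "query": phrase,  # inserted before serialization, so json.dumps escapes it
                    "slop": slop,
                }
            }
        },
    }
    return json.dumps(body)


# A phrase containing double quotes still yields valid JSON.
print(build_phrase_query('update "phone"'))
```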
Issue:
With synonym option 1, I get results for the phrase "update phone".
With synonym option 2, I don't get any results.
Option 1 also seems to return other documents that contain only one of "update" / "phone", rather than both terms within a slop of 3 of each other.