Compound Words not found but Filter is configured


#1

Hi guys,

I am experimenting with search phrases for German compound words. The test word is "kennenlernen", which consists of the two words "kennen" and "lernen".
I configured the following settings so that the search phrase "kennen" finds the word "kennenlernen".

Any ideas why this won't work, or suggestions for debugging the indexed terms?

Thanks a lot.

Settings:
"analysis": {
  "filter": {
    "german_stop": {
      "type": "stop",
      "stopwords": "german"
    },
    "german_stemmer": {
      "type": "stemmer",
      "language": "light_german"
    },
    "german_compound_filter": {
      "type": "dictionary_decompounder",
      "word_list": [
        "kennenlernen",
        "kennen",
        "lernen"
      ],
      "min_subword_size": 2
    }
  },
  "analyzer": {
    "german_mod": {
      "type": "custom",
      "tokenizer": "standard",
      "filter": [
        "lowercase",
        "german_stop",
        "german_normalization",
        "german_stemmer",
        "german_compound_filter"
      ]
    }
  }
}

Mapping:
"_all": {
  "analyzer": "german_mod"
}

Search with no hits:
"query": {
  "match": {
    "_all": {
      "query": "kennen"
    }
  }
}

Search with one hit:
"query": {
  "match": {
    "_all": {
      "query": "kennenlernen"
    }
  }
}
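
One thing worth checking with the settings above is the filter order: the decompounder runs after the stemmer, so it only sees whatever the stemmer emits. The following is a simplified Python model of a dictionary decompounder, not Lucene's actual code. The function name `decompound` is made up for illustration, the defaults (`min_word_size` 5, `max_subword_size` 15) follow the documented defaults of the filter, and the stemmed forms "kennenlern" and "kenn" are assumptions about what a light German stemmer might produce. The real tokens should be confirmed against the index itself.

```python
def decompound(token, word_list, min_word_size=5,
               min_subword_size=2, max_subword_size=15):
    """Return the original token followed by each dictionary word found inside it.

    Simplified model: duplicates are skipped, which the real filter may not do.
    """
    tokens = [token]
    if len(token) < min_word_size:
        # Tokens shorter than min_word_size are passed through unchanged.
        return tokens
    for start in range(len(token)):
        for size in range(min_subword_size, max_subword_size + 1):
            if start + size > len(token):
                break
            candidate = token[start:start + size]
            if candidate in word_list and candidate not in tokens:
                tokens.append(candidate)
    return tokens


WORD_LIST = {"kennenlernen", "kennen", "lernen"}

# Unstemmed input: both parts are found in the dictionary.
print(decompound("kennenlernen", WORD_LIST))

# Assumed stemmed forms (hypothetical, not taken from the thread):
# the indexed side still yields "kennen", but the query side stays "kenn".
print(decompound("kennenlern", WORD_LIST))
print(decompound("kenn", WORD_LIST))
```

If the stemmer really does shorten the query term this way, the query token and the indexed subword would differ, which would explain the missing hit. Moving the decompounder in front of the stemmer is one thing to try under that assumption.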


#2

Good morning,

Is there any way to debug or analyze the query process to find out why the query can't find a hit? Any other ideas?

Thanks a lot.


(Jörg Prante) #3

See https://www.elastic.co/guide/en/elasticsearch/guide/current/_validating_queries.html


(Jörg Prante) #4

And also https://www.elastic.co/guide/en/elasticsearch/reference/current/_explain_analyze.html
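
As a rough illustration of the analyze approach, the sketch below sends a text through a named analyzer and lists the resulting tokens, using only the Python standard library. The base URL `http://localhost:9200` and the index name `my_index` are placeholders, the helper names are made up for this example, and the request body shape (`analyzer` plus `text`) is assumed to match the Elasticsearch version in use.

```python
import json
import urllib.request


def build_analyze_request(base_url, index, analyzer, text):
    """Build a POST request for the index-level _analyze endpoint."""
    url = f"{base_url}/{index}/_analyze"
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_tokens(response):
    """Pull the token strings out of a parsed _analyze response."""
    return [entry["token"] for entry in response.get("tokens", [])]


def analyze(base_url, index, analyzer, text):
    """Send the request and return the list of emitted tokens."""
    request = build_analyze_request(base_url, index, analyzer, text)
    with urllib.request.urlopen(request) as reply:
        return extract_tokens(json.load(reply))


# Example call (needs a running cluster; names are placeholders):
# analyze("http://localhost:9200", "my_index", "german_mod", "kennenlernen")
# analyze("http://localhost:9200", "my_index", "german_mod", "kennen")
```

Comparing the two token lists shows directly whether the query term produces a token that was also indexed for the compound word.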


#5

Hi,

Thanks for your help. I replaced the compound filter with an ngram tokenizer as a more generic solution, and this works for me.
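
The ngram configuration itself was not posted, so the following is only a guess at what such a setup could look like. The names `ngram_tokenizer` and `german_ngram` and the `min_gram`/`max_gram` values of 3 and 6 are assumptions. The small `char_ngrams` helper is a plain Python model of character n-grams that shows why "kennen" becomes findable inside "kennenlernen".

```python
def char_ngrams(text, min_gram, max_gram):
    """Return all character n-grams of text with lengths min_gram..max_gram."""
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


# Hypothetical analysis settings using an ngram tokenizer (values are assumptions).
NGRAM_ANALYSIS = {
    "analysis": {
        "tokenizer": {
            "ngram_tokenizer": {"type": "ngram", "min_gram": 3, "max_gram": 6}
        },
        "analyzer": {
            "german_ngram": {
                "type": "custom",
                "tokenizer": "ngram_tokenizer",
                "filter": ["lowercase"],
            }
        },
    }
}

grams = char_ngrams("kennenlernen", 3, 6)
print("kennen" in grams)  # True
print("lernen" in grams)  # True
```

The trade-off is index size: every word is stored as many overlapping fragments, and short fragments can match unrelated words. Newer Elasticsearch versions may also restrict the allowed gap between `min_gram` and `max_gram` through the `index.max_ngram_diff` setting.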

Thanks

