Customize a Dictionary_Decompounder

Hello Elasticsearch experts,

I have a question about writing a custom analyzer using a dictionary_decompounder.

Here is a simplified example of the current mappings and settings of my index.

PUT my_index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text"
      }
    }
  },
  "settings": {
    "analysis": {
      "filter": {
        "decomp_de": {
          "type": "dictionary_decompounder",
          "word_list": ["dreh", "moment", "schlüssel"]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": ["decomp_de"]
        }
      }
    }
  }
}

When using the custom analyzer, I see that the decompounding is working as expected:

GET /my_index/_analyze
{
  "text": ["drehmomentschlüssel"],
  "analyzer": "my_analyzer"
}

As a result, I get the tokens (drehmomentschlüssel, dreh, moment, schlüssel).

Next, I index the following two documents:
Document 1:

{"content":"drehmomentschlüssel additional text"}

Document 2:

{"content": "schlüssel additional text"}

Currently, when I search for
(1) drehmomentschlüssel, I get Document 1 and Document 2
(2) schlüssel, I get Document 1 and Document 2
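To illustrate the behavior: because the same analyzer also runs at query time, a search for drehmomentschlüssel is decompounded into the tokens dreh, moment, and schlüssel, and a default match query combines them with OR, so the lone schlüssel in Document 2 matches too. A minimal query that reproduces this (assuming the index above):

```json
GET /my_index/_search
{
  "query": {
    "match": {
      "content": "drehmomentschlüssel"
    }
  }
}
```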

My desired result is the following:
Searching for
(1) drehmomentschlüssel should return only Document 1
(2) schlüssel should return Document 1 and Document 2

So the decompounding has to happen at index time; otherwise search (2) would not return Document 1. That means I need to use a different analyzer at search time.
Any ideas?
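One way to sketch this is with the `search_analyzer` mapping parameter: keep `my_analyzer` for indexing, but use a plain analyzer (here `standard`, as an assumption) for queries, so search terms are not decompounded. The `analysis` settings stay as above:

```json
PUT my_index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "my_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}
```

With this mapping, a query for schlüssel still matches Document 1 (the subword token was stored at index time), while a query for drehmomentschlüssel is left intact and only matches Document 1.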

Please share your query. It looks as if you are combining terms with an OR instead of an AND.
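The suggestion above can be sketched as a match query with `"operator": "and"`, so that all decompounded tokens must be present in a document. Under that assumption, a search for drehmomentschlüssel would require dreh, moment, and schlüssel together and return only Document 1:

```json
GET /my_index/_search
{
  "query": {
    "match": {
      "content": {
        "query": "drehmomentschlüssel",
        "operator": "and"
      }
    }
  }
}
```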

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.