Part-of-word matching challenge

Hi,

We're looking for a way to match on parts of words in single-term queries, but only for some words, not all of them.

For example (in Dutch):

  • searching for 'verlichting' should match on 'tuinverlichting' or 'wandverlichting' but
  • searching for 'pop' should NOT match on 'popular'
  • searching for 'bed' should match on 'beddengoed' or 'dekbedden' but NOT on 'bedrading' or 'bedrijf'

What is the best strategy?

Maarten

One approach is to use a word-decompounding token filter for Dutch.
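
To make the idea concrete, here is a minimal Python sketch of what a dictionary-based decompounder does: it keeps the original token and also emits every word from a word list that occurs inside it. This is a simplification for illustration only, not the Lucene/Elasticsearch implementation; the function name `decompound`, the word list and the size defaults are assumptions.

```python
def decompound(token, word_list, min_subword_size=2, max_subword_size=15):
    """Return the token followed by every word-list entry found inside it."""
    dictionary = {word.lower() for word in word_list}
    lowered = token.lower()
    parts = [token]
    for start in range(len(lowered)):
        for length in range(min_subword_size, max_subword_size + 1):
            end = start + length
            if end > len(lowered):
                break
            candidate = lowered[start:end]
            # Skip the whole token itself so it is not emitted twice.
            if candidate in dictionary and candidate != lowered:
                parts.append(candidate)
    return parts


# Hypothetical word list: only words that should be searchable as parts.
WORDS = ["verlichting", "bed"]

for word in ["tuinverlichting", "dekbedden", "popular", "bedrijf"]:
    print(word, "->", decompound(word, WORDS))
```

Because 'pop' is not in the word list, 'popular' stays a single token. Note that plain substring lookup also finds 'bed' inside 'bedrijf', so the word list alone probably does not cover the third requirement and some extra handling for those words is still needed.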

Hi,

Thanks for your reply. I've looked at https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-compound-word-tokenfilter.html#_dictionary_decompounder but the documentation is lacking.

I've tried it (see https://gist.github.com/anonymous/f6f3067b02af50928751127a1e351e63),
but somehow it does not work at all.

Maybe you can spot the error? I'm using Elasticsearch 5.2.

Is this what you meant by the way?

Thanks,
Maarten

I figured out the decompounding problem: I had put the quotes around the whole [ ... ] list instead of around each word. In combination with stemmer_override, which let me manually override the stemming of specific words, I got better results.
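
For anyone who runs into the same quoting mistake, here is a rough Python sketch that builds the index settings as a dict and serialises them to JSON. It is not the content of the gist: the filter and analyzer names, the word list, the `rules_path` file and the order of the filters are all assumptions made for illustration.

```python
import json

# Hypothetical word list; each word is its own quoted string.
WORD_LIST = ["verlichting", "bed"]

settings = {
    "analysis": {
        "filter": {
            "dutch_decompounder": {
                "type": "dictionary_decompounder",
                "word_list": WORD_LIST,  # a JSON array, not one quoted string
            },
            "dutch_overrides": {
                "type": "stemmer_override",
                # Hypothetical file with lines such as "token => override".
                "rules_path": "analysis/stemmer_override_nl.txt",
            },
            "dutch_stemmer": {"type": "stemmer", "language": "dutch"},
        },
        "analyzer": {
            "dutch_compound": {
                "type": "custom",
                "tokenizer": "standard",
                # Assumed order; stemmer_override has to come before the stemmer.
                "filter": [
                    "lowercase",
                    "dutch_decompounder",
                    "dutch_overrides",
                    "dutch_stemmer",
                ],
            }
        },
    }
}

# The mistake described above: the whole list ends up as a single string.
broken = {"type": "dictionary_decompounder", "word_list": "[verlichting, bed]"}

body = json.dumps({"settings": settings}, indent=2)
print(body)
```

Building the body in code and letting `json.dumps` do the quoting makes it harder to end up with the quotes around the brackets by accident.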
