Hyphenation decompounder - how to use?

Babadofar · September 30, 2015, 11:05am

Hi!
I'm trying to use the hyphenation decompounder as described here [1], on Elasticsearch 1.7.2.
I downloaded the hyphenation grammar files and tried both an inline list of words and a large word-list file, but the filter does not split anything. I have tried both Norwegian and English. The same word list used with the dictionary decompounder produces a lot of tokens.

sample code: https://gist.github.com/babadofar/858df2b321f2209e20af
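
For readers without access to the gist, here is a minimal sketch of the kind of index settings being discussed. It is not a copy of the gist. The filter type `hyphenation_decompounder` and the parameters `hyphenation_patterns_path`, `word_list`, and `min_subword_size` come from the compound word token filter documentation; the names `my_decompounder` and `my_analyzer`, the path `analysis/no.xml`, and the sample words are assumptions made up for illustration.

```python
import json


def build_decompounder_settings(patterns_path, words):
    """Build an index-settings body with a hyphenation_decompounder filter.

    patterns_path: location of the hyphenation pattern XML file
                   (hypothetical value below; adjust to your install).
    words:         inline word list the filter matches subwords against.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # "my_decompounder" is an illustrative name, not a built-in
                    "my_decompounder": {
                        "type": "hyphenation_decompounder",
                        "hyphenation_patterns_path": patterns_path,
                        "word_list": list(words),
                        "min_subword_size": 3,
                    }
                },
                "analyzer": {
                    # "my_analyzer" is an illustrative name as well
                    "my_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "my_decompounder"],
                    }
                },
            }
        }
    }


# Hypothetical path and words, chosen only to make the sketch concrete.
body = build_decompounder_settings("analysis/no.xml", ["fotball", "fot", "ball"])
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body when creating the index. Whether such a configuration actually splits tokens on 1.7.2 is exactly the open question in this thread.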

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-compound-word-tokenfilter.html

Babadofar · October 5, 2015, 11:34am

I still can't get this working, so I created an issue on GitHub: https://github.com/elastic/elasticsearch/issues/13935
