Hyphenation decompounder - how to use?

(Christoffer Vig) #1

I'm trying to use the hyphenation decompounder as described here [1], on Elasticsearch 1.7.2.
I downloaded the hyphenation grammar files and tried supplying the words both as an inline list and as a large word-list file, but the filter does not split anything. I have tried both Norwegian and English. The same word list with the dictionary decompounder produces a lot of tokens.

sample code: https://gist.github.com/babadofar/858df2b321f2209e20af
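
For readers without access to the gist, here is a minimal sketch of the kind of index settings involved. It is not the code from the gist and not a known fix. The filter and analyzer names (`hyph_decompounder`, `decompound_analyzer`), the pattern file path `analysis/hyph/en.xml`, and the two-word list are assumptions chosen for illustration. The parameter names follow the compound word token filter documentation [1]. The script only builds and prints the request body; it does not contact a cluster.

```python
import json


def build_decompounder_settings(patterns_path, word_list):
    """Build index settings that wire a hyphenation_decompounder filter
    into a custom analyzer. All names used here are illustrative."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Hypothetical filter name.
                    "hyph_decompounder": {
                        "type": "hyphenation_decompounder",
                        # FOP XML hyphenation pattern file; the path is an
                        # assumption, given relative to the config directory.
                        "hyphenation_patterns_path": patterns_path,
                        # Inline word list; "word_list_path" would point
                        # to a file of words instead.
                        "word_list": word_list,
                        "min_subword_size": 3,
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name.
                    "decompound_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "hyph_decompounder"],
                    }
                },
            }
        }
    }


body = build_decompounder_settings("analysis/hyph/en.xml", ["foot", "ball"])
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body when creating the index. Running the same text through an analyzer that uses the dictionary decompounder with the same word list gives a direct comparison of the tokens each filter produces.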

  1. https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-compound-word-tokenfilter.html

(Christoffer Vig) #2

I still can't get this working, so I created an issue on GitHub: https://github.com/elastic/elasticsearch/issues/13935
