Hi,
I am trying to switch the compound word token filter from an inline word_list to a file containing the list of words.
Right now I have it working by defining the filter as follows:
"compound_word_splitter": {
  "type": "dictionary_decompounder",
  "min_word_size": 4,
  "min_subword_size": 3,
  "word_list": ["icecream", "smokehouse", "car"]
}
I would like to keep the word list in a separate file, as described in the docs:
"compound_word_splitter": {
  "type": "dictionary_decompounder",
  "min_word_size": 4,
  "min_subword_size": 3,
  "word_list_path": "analysis/theWords.txt"
}
I tried the word list file in each of the following two formats, and the filter does not work with either:
"icecream","smokehouse","car"
icecream,smokehouse,car
Does anyone have any idea what I am missing so that the list of words is recognized?
The file is at analysis/theWords.txt, relative to the location of the config file.
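For reference, as far as I can tell the docs for the decompounder filters describe the word_list_path file as UTF-8 encoded with each token on its own line, so a newline-separated file may be what is expected rather than either format above. Below is a minimal sketch for converting my comma-separated list into that layout. It assumes Python is available, and the function name to_word_lines is hypothetical, made up for this example:

```python
def to_word_lines(raw: str) -> str:
    """Turn a comma-separated, optionally double-quoted word list
    into text with one word per line (hypothetical helper)."""
    # Split on commas, then trim whitespace and surrounding double quotes.
    words = [w.strip().strip('"') for w in raw.split(",")]
    # Drop empty entries (e.g. from a trailing comma).
    words = [w for w in words if w]
    return "".join(w + "\n" for w in words)


if __name__ == "__main__":
    # Example input taken from the first format I tried.
    print(to_word_lines('"icecream","smokehouse","car"'), end="")
```

The output would then be saved as analysis/theWords.txt with UTF-8 encoding. I have not confirmed that this is the only thing wrong with my setup.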
Thank you.