I need a filter that splits a word into two words when it ends with a suffix from a list (maybe a text file containing all the suffixes), but I can't find an existing filter that does this.
Does anyone have a solution to this?
If not, is there a way to write my own filter in Java and add it to Elasticsearch? :)
Off the top of my head, I cannot think of an existing filter that
accomplishes that task.
Creating a custom filter is easy: simply write a Lucene filter and build a plug-in around it. Take a look at the existing analysis plug-ins for inspiration.
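A minimal sketch of the splitting logic such a custom filter would wrap, shown here outside of any Lucene plumbing (the class and method names are made up for illustration, not an existing API):

```java
import java.util.List;

// Hypothetical helper: splits a token into stem + suffix when it ends
// with one of the configured suffixes. A custom Lucene TokenFilter
// would call logic like this and emit the two parts as separate tokens.
public class SuffixSplitter {
    private final List<String> suffixes;

    public SuffixSplitter(List<String> suffixes) {
        this.suffixes = suffixes;
    }

    /** Returns [stem, suffix] when a suffix matches, else [word] unchanged. */
    public List<String> split(String word) {
        for (String suffix : suffixes) {
            // Require the word to be longer than the suffix so the stem is non-empty.
            if (word.length() > suffix.length() && word.endsWith(suffix)) {
                String stem = word.substring(0, word.length() - suffix.length());
                return List.of(stem, suffix);
            }
        }
        return List.of(word);
    }
}
```

The suffix list itself could be loaded from the text file mentioned in the question, one suffix per line.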
I wonder if you could use a Pattern Tokenizer in that case?
--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 12 March 2015 at 04:32, Ivan Brusic ivan@brusic.com wrote:
Yes, that's an idea :)
The Pattern Tokenizer seems to give the results I want, but I don't know whether it's possible to define patterns based on a list of specific words.
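One way to get a pattern from a list of specific words is to build a regex alternation from the suffix list and split with a zero-width lookahead, so the suffix stays as its own token. A rough sketch, assuming the tokenizer accepts a Java-style lookahead pattern (the class name and suffixes here are made-up examples):

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Sketch: turn a suffix list into a split pattern. A lookahead such as
// "(?=(?:ball|berry)$)" matches the zero-width position just before a
// trailing suffix, so splitting there yields [stem, suffix].
public class SuffixPattern {
    public static Pattern fromSuffixes(List<String> suffixes) {
        String alternation = suffixes.stream()
                .map(Pattern::quote) // escape any regex metacharacters
                .collect(Collectors.joining("|"));
        return Pattern.compile("(?=(?:" + alternation + ")$)");
    }

    public static void main(String[] args) {
        Pattern p = fromSuffixes(List.of("ball", "berry"));
        System.out.println(String.join(" ", p.split("football")));
    }
}
```

Since the Pattern Tokenizer uses Java regular expressions, a pattern generated this way could in principle be placed in the analyzer settings, though it would have to be regenerated whenever the suffix file changes.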