Hello,
Is it correct that in order to use the PatternTokenizer, one would
need to implement a plugin similar to icu?
Thanks,
Paul
Yes, but it can be part of the built-in analyzers in elasticsearch (I assume
you refer to the one in Lucene).
-shay.banon
On Sun, Jul 25, 2010 at 12:28 PM, Paul ppearcy@gmail.com wrote:
Hello,
Is it correct that in order to use the PatternTokenizer, one would
need to implement a plugin similar to icu?
Thanks,
Paul
Added this: Analysis: Add pattern analyzer · Issue #276 · elastic/elasticsearch · GitHub.
On Sun, Jul 25, 2010 at 9:50 PM, Shay Banon shay.banon@elasticsearch.com wrote:
Yes, but it can be part of the built in analyzers in elasticsearch (I
assume you refer to the one in Lucene).
-shay.banon
Yeah, it probably makes sense to have it built in. I'd be happy to
create a fork and submit it. I would plan on exposing the pattern,
lowercase, and stopwords options, which map directly to Lucene's
PatternAnalyzer inputs.
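As a rough sketch of what those three options would do to a piece of text (plain Python using only the standard `re` module, not Lucene's or elasticsearch's actual API; the function name `pattern_analyze` and its default pattern are made up for illustration):

```python
import re

def pattern_analyze(text, pattern=r"\W+", lowercase=True, stopwords=()):
    """Hypothetical stand-in: split on a regex, lowercase, drop stopwords."""
    # The pattern describes the separators between tokens.
    # Empty strings (and None from capture groups) are discarded.
    tokens = [t for t in re.split(pattern, text) if t]
    if lowercase:
        tokens = [t.lower() for t in tokens]
    # Stopwords are compared against the (possibly lowercased) tokens.
    stop = set(stopwords)
    return [t for t in tokens if t not in stop]

print(pattern_analyze("The Quick, brown fox", stopwords=["the"]))
# prints ['quick', 'brown', 'fox']
```

The point is only that all three options belong to one fixed pipeline, which is why the analyzer is simple to expose but cannot be mixed with other components.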
A separate pattern tokenizer would be nice to combine with other
options, but that doesn't appear to exist in Lucene (though Solr has a
more flexible version based on regex grouping that will probably be
available with the Lucene/Solr merge). Not that it would be hard to
write; I just don't need it for my use case.
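For illustration, a standalone pattern tokenizer with a regex-group option might look roughly like the sketch below. It is plain Python, not the Solr or Lucene code; the name `pattern_tokenize` and the convention that a negative `group` means "split on the pattern" are assumptions made for this example.

```python
import re

def pattern_tokenize(text, pattern, group=-1):
    """Hypothetical tokenizer: split on the pattern, or extract one group."""
    regex = re.compile(pattern)
    if group < 0:
        # Split mode: the pattern matches the separators between tokens.
        tokens, start = [], 0
        for match in regex.finditer(text):
            tokens.append(text[start:match.start()])
            start = match.end()
        tokens.append(text[start:])
        return [t for t in tokens if t]
    # Group mode: each match contributes the text of the chosen group.
    return [m.group(group) for m in regex.finditer(text) if m.group(group)]

# Because this is only a tokenizer, later steps can be chained freely.
lowered = [t.lower() for t in pattern_tokenize("Foo  Bar", r"\s+")]
print(lowered)
print(pattern_tokenize("id=42 id=7", r"id=(\d+)", group=1))
```

Keeping the tokenizer separate from lowercasing and stopword removal is what would allow it to be combined with other options.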
Thanks,
Paul
Huh, somehow Nabble (which shows your response referencing
Analysis: Add pattern analyzer · Issue #276 · elastic/elasticsearch · GitHub) and
Google Groups (which doesn't) are out of sync.
Anyway, thanks a ton! It seems straightforward, and I'll let you know if
there are any issues.
Best Regards,
Paul
On Jul 25, 5:16 pm, Paul ppea...@gmail.com wrote:
Yeah, it probably makes sense to have it built in. I'd be happy to
create a fork and submit it. Would plan on exposing the pattern,
lowercase, and stopwords options that map directly to Lucene's
PatternAnalyzer inputs.
A separate pattern tokenizer would be nice to combine with other
options, but that doesn't appear to exist in Lucene (though Solr has a
more flexible version based on regex grouping that will probably be
available with the Lucene/Solr merge). Not that it would be hard to
write, just don't need it for my use case.
Thanks,
Paul