Adding filter to existing analyzer


(barnybug) #1

Would like to add asciifolding to the default english analyzer. Is this
possible?
Or alternatively can I define a new analyzer with the same
tokenizer/filters as the EnglishAnalyzer?

thanks

Barnaby


(Jan Fiedler) #2

I believe your best bet is to define a custom analyzer. Find documentation
on how to set this up here: http://www.elasticsearch.org/guide/reference/index-modules/analysis/custom-analyzer.html
Your custom analyzer can use a similar filter chain as the default analyzer
(but I am not sure whether and where this is documented).


(barnybug) #3

Thanks. So my alternate question - is there a way of knowing what the
default analyzers define in terms of tokenizers/filters?

Digging into the code suggests they delegate to Lucene's built-in analyzers
(e.g. EnglishAnalyzer for english), which is fairly opaque as to what it's
doing.

Barnaby

On Tuesday, 8 May 2012 14:43:49 UTC+1, Jan Fiedler wrote:

I believe your best bet is to define a custom analyzer. Find documentation
on how to set this up here: http://www.elasticsearch.org/guide/reference/index-modules/analysis/custom-analyzer.html
Your custom analyzer can use a similar filter chain as the default analyzer
(but I am not sure whether and where this is documented).


(Shay Banon) #4

Here is what is used for the EnglishAnalyzer:

tokenizer: standard
filter: standard, stemmer (possessive_english), lowercase, stop, porter_stem
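
For illustration only, here is a rough sketch of how the chain above might be written as a custom analyzer with asciifolding added. It is a small Python script that just builds and prints the index settings as JSON. The analyzer name "english_folded", the filter name "english_possessive", the function name, and the position of asciifolding (right after lowercase) are all assumptions made for this example, not something confirmed in this thread. The tokenizer and filter names are copied from the list above and may differ between Elasticsearch versions.

```python
import json


def english_with_asciifolding_settings(analyzer_name="english_folded"):
    """Build index settings for a custom analyzer that mirrors the chain
    listed above, with asciifolding added.

    Hypothetical names: analyzer_name and "english_possessive" are made up
    for this sketch. Placing asciifolding after lowercase is an assumption.
    """
    # The possessive stemmer needs its own named filter definition because
    # it is a "stemmer" filter configured with a specific name.
    filters = {
        "english_possessive": {
            "type": "stemmer",
            "name": "possessive_english",
        }
    }
    analyzer = {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
            "standard",
            "english_possessive",
            "lowercase",
            "asciifolding",  # the added filter (position is an assumption)
            "stop",
            "porter_stem",
        ],
    }
    return {
        "settings": {
            "analysis": {
                "filter": filters,
                "analyzer": {analyzer_name: analyzer},
            }
        }
    }


if __name__ == "__main__":
    # Print the settings so they can be pasted into an index creation request.
    print(json.dumps(english_with_asciifolding_settings(), indent=2))
```

The printed JSON would be used as the body when creating the index, and a field would then refer to the analyzer by the chosen name. It is worth checking the output with the analyze API on a few accented words before relying on it.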

On Tue, May 8, 2012 at 7:23 PM, barnybug barnybug@googlemail.com wrote:

Thanks. So my alternate question - is there a way of knowing what the
default analyzers define in terms of tokenizers/filters?

Digging into the code suggests they delegate to Lucene's built-in analyzers
(e.g. EnglishAnalyzer for english), which is fairly opaque as to what it's
doing.

Barnaby


