How can I put together a case-insensitive analyzer for tokens?

pulkitsinghal · November 19, 2013, 5:16pm

My input tokens are like:
abcd-123
ABCD-123
abCD-123

Right now I don't analyze them at all, but that comes back to bite me if
someone searches for them using different letter case.

So I want to use or put together an analyzer that doesn't break these
tokens apart but still allows them to be analyzed for case-insensitive
search later. Any suggestions?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

mattweber · November 19, 2013, 5:25pm

Configure a custom analyzer with the keyword tokenizer and lowercase token
filter.

index :
    analysis :
        analyzer :
            lowerKeyword :
                type : custom
                tokenizer : keyword
                filter : [lowercase]

Thanks,
Matt Weber
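
To make the effect of that analyzer concrete, here is a minimal plain-Java sketch of what a keyword tokenizer followed by a lowercase filter does to the sample inputs. It is an illustration only, not the Lucene or Elasticsearch implementation; the class name LowerKeywordSketch and the method analyze are hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; this is not the Lucene/Elasticsearch code.
class LowerKeywordSketch {

    // Mimics a keyword tokenizer followed by a lowercase filter:
    // the whole input stays a single token, and that token is lowercased.
    static List<String> analyze(String input) {
        String token = input.toLowerCase(Locale.ROOT);
        return Collections.singletonList(token);
    }

    public static void main(String[] args) {
        // The three sample inputs from the question.
        for (String raw : new String[] {"abcd-123", "ABCD-123", "abCD-123"}) {
            System.out.println(raw + " -> " + analyze(raw));
        }
    }
}
```

All three inputs end up as the same single token, abcd-123, with the hyphen intact. The match is case-insensitive only if the search text is run through the same analysis.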

pulkitsinghal · November 19, 2013, 9:06pm

index :
    analysis :
        analyzer :
            lowerKeyword :
                type : custom
                tokenizer : keyword
                filter : [lowercase]

Thanks a lot Matt!

Can anyone also tell me how to set this up programmatically in Java?
I don't know how to set a name (like lowerKeyword) for the analyzer.
Here's what I have so far:
indexerSettings.put("analysis.analyzer.events.type", "custom");
indexerSettings.put("analysis.analyzer.events.tokenizer", "keyword");
indexerSettings.put("analysis.analyzer.events.filter", "lowercase");

pulkitsinghal · November 19, 2013, 11:48pm

Cihat Keser, over from Jest https://github.com/searchbox-io/Jest, pointed
out that the string "events" in the code block below is what serves as
the name of the analyzer:

 indexerSettings.put("analysis.analyzer.events.type", "custom");
 indexerSettings.put("analysis.analyzer.events.tokenizer", "keyword");
 indexerSettings.put("analysis.analyzer.events.filter", "lowercase");
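
Since the name is just the key segment after "analysis.analyzer.", the same three keys can be generated for any analyzer name. The sketch below uses a plain java.util.Map so that it is self-contained; the class AnalyzerSettingsSketch and the helper lowerKeywordAnalyzer are hypothetical. It assumes the settings object accepts flat dotted keys, as the snippet above does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AnalyzerSettingsSketch {

    // Hypothetical helper: builds the flat, dotted setting keys for a custom
    // keyword + lowercase analyzer registered under the given name.
    static Map<String, String> lowerKeywordAnalyzer(String name) {
        String prefix = "analysis.analyzer." + name + ".";
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put(prefix + "type", "custom");
        settings.put(prefix + "tokenizer", "keyword");
        settings.put(prefix + "filter", "lowercase");
        return settings;
    }

    public static void main(String[] args) {
        // "lowerKeyword" is the name used in the YAML earlier in the thread.
        lowerKeywordAnalyzer("lowerKeyword")
                .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Each entry could then be copied into indexerSettings with the same put(key, value) call shown above.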
