I am trying to create a custom analyzer. Definition:
My use case is that, given an input, the analyzer should
Remove stop words
Remove special characters
So if the input term is
"This is a Elastic-Search Test01"
the output should be
"ElasticSearchTest01"
But when I build the index with the given definition, it throws an exception:
Caused by: org.elasticsearch.ElasticSearchIllegalArgumentException: failed to find char filter type [pattern_replace] for [special_char_pattern]
    at org.elasticsearch.index.analysis.AnalysisModule.configure(AnalysisModule.java:189)
    at org.elasticsearch.common.inject.AbstractModule.configure(AbstractModule.java:60)
    at org.elasticsearch.common.inject.spi.Elements$RecordingBinder.install(Elements.java:201)
    at org.elasticsearch.common.inject.spi.Elements.getElements(Elements.java:82)
    at org.elasticsearch.common.inject.InjectorShell$Builder.build(InjectorShell.java:130)
    at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:99)
    at org.elasticsearch.common.inject.InjectorImpl.createChildInjector(InjectorImpl.java:129)
    at org.elasticsearch.common.inject.ModulesBuilder.createChildInjector(ModulesBuilder.java:66)
    at org.elasticsearch.indices.InternalIndicesService.createIndex(InternalIndicesService.java:380)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$1.execute(MetaDataCreateIndexService.java:269)
    at org.elasticsearch.cluster.service.InternalClusterService$2.run(InternalClusterService.java:229)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:95)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Caused by: org.elasticsearch.common.settings.NoClassSettingsException: Failed to load class setting [type] with value [pattern_replace]
    at org.elasticsearch.common.settings.ImmutableSettings.loadClass(ImmutableSettings.java:348)
    at org.elasticsearch.common.settings.ImmutableSettings.getAsClass(ImmutableSettings.java:336)
    at org.elasticsearch.index.analysis.AnalysisModule.configure(AnalysisModule.java:179)
    ... 14 more
Caused by: java.lang.ClassNotFoundException: org.elasticsearch.index.analysis.patternreplace.PatternReplaceCharFilterFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at org.elasticsearch.common.settings.ImmutableSettings.loadClass(ImmutableSettings.java:346)
    ... 16 more
The Elasticsearch version is 0.90.1. I think that char filter came in
0.90.3.
Can you please tell me, then, how I can make an analyzer that removes
all the special characters and the stop words but produces only one
token for the input? I tried the definition linked as "new analyzer · GitHub".
On Wednesday, December 4, 2013 11:56:40 PM UTC+5:30, Ivan Brusic wrote:
Sorry, sent early by mistake.
Which version of elasticsearch are you using? According to the following
issue, the filter was added in version 0.90.3.
You can use the mapping char filter if you are still on 0.90.1.
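As a rough sketch of that suggestion, the index settings below place a mapping char filter ahead of the keyword tokenizer. They are written as a Python dict so they can be dumped to JSON. The names strip_special and single_token are made up for illustration, the character list is only an example, and the assumption that an empty right-hand side ("-=>") deletes the character should be checked on 0.90.1 before relying on it.

```python
import json

# Hypothetical index settings: a "mapping" char filter deletes selected
# characters before the keyword tokenizer emits the whole input as one token.
# The filter name, analyzer name, and character list are illustrative only.
index_settings = {
    "settings": {
        "analysis": {
            "char_filter": {
                "strip_special": {
                    "type": "mapping",
                    # Each rule reads "match=>replacement"; an empty
                    # replacement is assumed to delete the match.
                    "mappings": ["-=>", "@=>"],
                }
            },
            "analyzer": {
                "single_token": {
                    "type": "custom",
                    "char_filter": ["strip_special"],
                    "tokenizer": "keyword",
                }
            },
        }
    }
}

# Body that would be sent when creating the index.
print(json.dumps(index_settings, indent=2))
```

Every character to be removed needs its own entry in the mappings list, which is the main drawback compared with a single regular expression.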
You cannot use a stop filter (effectively) with the keyword tokenizer. The
stop filter works on the tokens that are emitted from the tokenizer, so it
will not see the tokenized words.
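The point about the stop filter can be illustrated with a small simulation in plain Python. It is not Elasticsearch code: the tokenizer and filter functions are simplified stand-ins, and the stop list is a made-up three-word example.

```python
# Simplified stand-ins for analysis components; not the Elasticsearch API.
STOP_WORDS = {"this", "is", "a"}  # tiny illustrative stop list


def keyword_tokenizer(text):
    """Emit the whole input as a single token."""
    return [text]


def whitespace_tokenizer(text):
    """Split the input on whitespace, for comparison."""
    return text.split()


def stop_filter(tokens):
    """Drop tokens that are exactly a stop word (case-insensitive here)."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


text = "This is a Elastic-Search Test01"

# One big token: no token equals a stop word, so nothing is removed.
print(stop_filter(keyword_tokenizer(text)))
# → ['This is a Elastic-Search Test01']

# Separate tokens: the stop words are visible and get dropped.
print(stop_filter(whitespace_tokenizer(text)))
# → ['Elastic-Search', 'Test01']
```

With the keyword tokenizer the filter only ever compares the full string against the stop list, which is why it has no visible effect.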
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@googlegroups.com.
Quite simply: you can't, at least not in any way I can think of. You can
fake it somewhat with the mapping or pattern_replace char filter, but you
would also have to account for word boundaries, which are normally taken
care of by the tokenizer.
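To make the word-boundary caveat concrete, here is a minimal Python sketch that fakes stop word removal with regular expressions before collapsing everything into one token. It only emulates the idea outside Elasticsearch; the stop list and the function name single_token are assumptions for illustration.

```python
import re

STOP_WORDS = ["this", "is", "a"]  # illustrative stop list, not a real default

# \b keeps "is" from matching inside "This" and "a" from matching inside
# "Elastic"; without it the pattern would eat parts of ordinary words.
STOP_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, STOP_WORDS)) + r")\b",
    re.IGNORECASE,
)
NON_ALNUM = re.compile(r"[^A-Za-z0-9]+")


def single_token(text):
    """Remove stop words first, then strip every non-alphanumeric character."""
    without_stops = STOP_PATTERN.sub("", text)
    return NON_ALNUM.sub("", without_stops)


print(single_token("This is a Elastic-Search Test01"))  # → ElasticSearchTest01
```

The boundaries are still cruder than a tokenizer: a hyphen also counts as a boundary, so an input such as "A-Team" loses its "A" and comes out as "Team".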
Thanks a lot, Ivan, for the reply. Can I just remove the special characters
and whitespace from the token?
Example:
I am @testing Elastic-search
should become
IamtestingElasticsearch
so that I can do fuzzy matches on it.
Thanks,
Arjit
Sure, you could just use your special_char_pattern filter and extend it to
include whitespace, or create a similar filter for whitespace only, with
the regex \s+
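The effect of the two patterns can be checked outside Elasticsearch with a short Python sketch. The original special_char_pattern definition is not shown in the thread, so the character class used for it here is an assumption; only \s+ comes from the reply.

```python
import re

# Assumed stand-in for special_char_pattern: anything that is neither a
# letter, a digit, nor whitespace. The real definition is not in the thread.
SPECIAL_CHARS = re.compile(r"[^A-Za-z0-9\s]+")
# Second filter for whitespace only, using the regex from the reply.
WHITESPACE = re.compile(r"\s+")


def collapse(text):
    """Apply both replacements in sequence, as two chained char filters would."""
    return WHITESPACE.sub("", SPECIAL_CHARS.sub("", text))


print(collapse("I am @testing Elastic-search"))  # → IamtestingElasticsearch
```

Merging both into one pattern, for example [^A-Za-z0-9]+, would give the same result for this input, since whitespace is not alphanumeric either.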
On Wed, Dec 4, 2013 at 6:24 PM, Arjit Gupta <arji...@gmail.com> wrote:
So, how can I effectively remove stop words and still keep the input as
just one token?
Thanks,
Arjit
On Thu, Dec 5, 2013 at 7:18 AM, Ivan Brusic <iv...@brusic.com<javascript:>
You cannot use a stop filter (effectively) with the keyword tokenizer.
The stop filter works on the tokens that are emitted from the tokenizer, so
it will not see the tokenized words.
Cheers,
Ivan
On Wed, Dec 4, 2013 at 5:38 PM, Arjit Gupta <arji...@gmail.com<javascript:>
wrote:
Elastic search version is 0.90.1 . I think the char filters came in
0.90.3.
Can you please tell me then how can I make an analyzer that just
removes all the special characters and removes the stop words but makes it
only one token for the input. I tried new analyzer · GitHub
On Wednesday, December 4, 2013 11:56:40 PM UTC+5:30, Ivan Brusic wrote:
Sorry, sent early by mistake.
Which version of elasticsearch are you using? According to the
following issue, the filter was added in version 0.90.3.
So if I have input term is
"This is a Elastic-Search Test01"
Output should be
"ElasticSearchTest01"
But when I build the index with the given definition it gives an
exception
Caused by: org.elasticsearch.ElasticSearchIllegalArgumentException:
failed to find char filter type [pattern_replace] for [special_char_pattern]
at org.elasticsearch.index.analysis.AnalysisModule.
configure(AnalysisModule.java:189)
at org.elasticsearch.common.inject.AbstractModule.
configure(AbstractModule.java:60)
at org.elasticsearch.common.inject.spi.Elements$
RecordingBinder.install(Elements.java:201)
at org.elasticsearch.common.inject.spi.Elements.
getElements(Elements.java:82)
at org.elasticsearch.common.inject.InjectorShell$Builder.
build(InjectorShell.java:130)
at org.elasticsearch.common.inject.InjectorBuilder.build(
InjectorBuilder.java:99)
at org.elasticsearch.common.inject.InjectorImpl.
createChildInjector(InjectorImpl.java:129)
at org.elasticsearch.common.inject.ModulesBuilder.
createChildInjector(ModulesBuilder.java:66)
at org.elasticsearch.indices.InternalIndicesService.createIndex(
InternalIndicesService.java:380)
at org.elasticsearch.cluster.metadata.
MetaDataCreateIndexService$1.execute(MetaDataCreateIndexService.
java:269)
at org.elasticsearch.cluster.service.InternalClusterService$2.run(
InternalClusterService.java:229)
at org.elasticsearch.common.util.concurrent.
PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(
PrioritizedEsThreadPoolExecutor.java:95)
at java.util.concurrent.ThreadPoolExecutor$Worker.
runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(
ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Caused by: org.elasticsearch.common.settings.NoClassSettingsException:
Failed to load class setting [type] with value [pattern_replace]
at org.elasticsearch.common.settings.ImmutableSettings.
loadClass(ImmutableSettings.java:348)
at org.elasticsearch.common.settings.ImmutableSettings.
getAsClass(ImmutableSettings.java:336)
at org.elasticsearch.index.analysis.AnalysisModule.
configure(AnalysisModule.java:179)
... 14 more
Caused by: java.lang.ClassNotFoundException:
org.elasticsearch.index.analysis.patternreplace.
PatternReplaceCharFilterFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at org.elasticsearch.common.settings.ImmutableSettings.
loadClass(ImmutableSettings.java:346)
... 16 more
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to elasticsearc...@googlegroups.com.
Sure, you could just use your special_char_pattern filter and extend it
to include whitespaces, or just create a similar filter for only whitespace
with regex \s+
On Thursday, December 5, 2013 5:01:27 AM UTC+1, Arjit Gupta wrote:
Thanks a lot Ivan for the reply. Can i just remove the special characters
and white space from the token?
Example :
I am @testing Elastic-search
Should become
IamtestingElasticsearch
So that I can do fuzzy matches on that.
Thanks ,
Arjit
On Thu, Dec 5, 2013 at 9:24 AM, Ivan Brusic iv...@brusic.com wrote:
Quite simply: you can't. At least not in any way I can think of. You can
fake it somewhat with the mapping or pattern_replace char filter, but you
would also have to account for word boundaries, which are normally taken
care of by the tokenizer.
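To make the word-boundary caveat concrete, here is a hedged Python sketch of the "fake it with a regex" idea. The stop list and both patterns are assumptions made for illustration, and Python's re module stands in for the Java regex engine a pattern_replace filter would use.

```python
import re

# Hypothetical, deliberately tiny stop list.
STOP_WORDS = ["this", "is", "a"]
ALTERNATION = "|".join(STOP_WORDS)

# With \b on both sides, only whole words are removed.
WITH_BOUNDARIES = re.compile(r"\b(?:" + ALTERNATION + r")\b", re.IGNORECASE)

# Without \b, the same letters are also removed inside longer words.
WITHOUT_BOUNDARIES = re.compile(r"(?:" + ALTERNATION + r")", re.IGNORECASE)

# Assumed definition of "special characters": anything non-alphanumeric.
NON_ALNUM = re.compile(r"[^A-Za-z0-9]+")


def collapse(text: str, stop_pattern: "re.Pattern[str]") -> str:
    """Remove stop words first, then strip everything non-alphanumeric."""
    without_stops = stop_pattern.sub("", text)
    return NON_ALNUM.sub("", without_stops)


if __name__ == "__main__":
    sample = "This is a Elastic-Search Test01"
    print(collapse(sample, WITH_BOUNDARIES))
    print(collapse(sample, WITHOUT_BOUNDARIES))
```

The order matters: the stop words have to be removed while the spaces and punctuation that delimit them are still present. Note also that \b treats a hyphen as a boundary, so a stop word inside a hyphenated term would be removed too.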
--
Ivan
On Wed, Dec 4, 2013 at 6:24 PM, Arjit Gupta arji...@gmail.com wrote:
So how can I effectively remove stop words and still keep the input as
just one token?
Thanks,
Arjit
On Thu, Dec 5, 2013 at 7:18 AM, Ivan Brusic iv...@brusic.com wrote:
You cannot use a stop filter (effectively) with the keyword tokenizer.
The stop filter works on the tokens emitted by the tokenizer, and the
keyword tokenizer emits the whole input as a single token, so the filter
never sees the individual words.
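A toy Python model of that pipeline may make the point clearer. These functions are simplified stand-ins, not Elasticsearch code: the tokenizers and the stop list are assumptions, and the comparison is lowercased here for brevity.

```python
from typing import List, Set

# Hypothetical, deliberately tiny stop list.
STOP_WORDS: Set[str] = {"this", "is", "a"}


def keyword_tokenizer(text: str) -> List[str]:
    """Emit the entire input as one token, like a keyword tokenizer."""
    return [text]


def whitespace_tokenizer(text: str) -> List[str]:
    """Emit one token per whitespace-separated word."""
    return text.split()


def stop_filter(tokens: List[str]) -> List[str]:
    """Drop tokens that are, as a whole, equal to a stop word."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


if __name__ == "__main__":
    sample = "This is a Elastic-Search Test01"
    # The single token is not itself a stop word, so nothing is removed.
    print(stop_filter(keyword_tokenizer(sample)))
    # With per-word tokens the stop words are removed as expected.
    print(stop_filter(whitespace_tokenizer(sample)))
```

The filter only ever compares complete tokens against the stop list, which is why it has no effect after a tokenizer that never splits the input.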
Cheers,
Ivan
On Wed, Dec 4, 2013 at 5:38 PM, Arjit Gupta arji...@gmail.com wrote:
The Elasticsearch version is 0.90.1. I think the pattern_replace char
filter came in 0.90.3.
Can you please tell me, then, how I can make an analyzer that removes
all the special characters and the stop words but produces only one
token for the input? I tried new analyzer · GitHub
On Wednesday, December 4, 2013 11:56:40 PM UTC+5:30, Ivan Brusic wrote:
Sorry, sent early by mistake.
Which version of elasticsearch are you using? According to the
following issue, the filter was added in version 0.90.3.