Exception while creating a custom analyzer


(Arjit Gupta) #1

Hi,

I am trying to create a custom analyzer. The definition is here:
https://gist.github.com/arjitgupta/7792134

My use case is that, for a given input, the analyzer should:

  1. Remove stop words
  2. Remove special characters

So if the input term is
"This is a Elastic-Search Test01"
the output should be
"ElasticSearchTest01"

But when I create the index with this definition, it throws an exception:

Caused by: org.elasticsearch.ElasticSearchIllegalArgumentException: failed to find char filter type [pattern_replace] for [special_char_pattern]
    at org.elasticsearch.index.analysis.AnalysisModule.configure(AnalysisModule.java:189)
    at org.elasticsearch.common.inject.AbstractModule.configure(AbstractModule.java:60)
    at org.elasticsearch.common.inject.spi.Elements$RecordingBinder.install(Elements.java:201)
    at org.elasticsearch.common.inject.spi.Elements.getElements(Elements.java:82)
    at org.elasticsearch.common.inject.InjectorShell$Builder.build(InjectorShell.java:130)
    at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:99)
    at org.elasticsearch.common.inject.InjectorImpl.createChildInjector(InjectorImpl.java:129)
    at org.elasticsearch.common.inject.ModulesBuilder.createChildInjector(ModulesBuilder.java:66)
    at org.elasticsearch.indices.InternalIndicesService.createIndex(InternalIndicesService.java:380)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$1.execute(MetaDataCreateIndexService.java:269)
    at org.elasticsearch.cluster.service.InternalClusterService$2.run(InternalClusterService.java:229)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:95)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Caused by: org.elasticsearch.common.settings.NoClassSettingsException: Failed to load class setting [type] with value [pattern_replace]
    at org.elasticsearch.common.settings.ImmutableSettings.loadClass(ImmutableSettings.java:348)
    at org.elasticsearch.common.settings.ImmutableSettings.getAsClass(ImmutableSettings.java:336)
    at org.elasticsearch.index.analysis.AnalysisModule.configure(AnalysisModule.java:179)
    ... 14 more
Caused by: java.lang.ClassNotFoundException: org.elasticsearch.index.analysis.patternreplace.PatternReplaceCharFilterFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at org.elasticsearch.common.settings.ImmutableSettings.loadClass(ImmutableSettings.java:346)
    ... 16 more

But the example at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pattern-replace-charfilter.html
is similar to what I wrote. Any clues on how to write this analyzer?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c42ff608-e14c-4689-b0eb-41a197d10624%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Ivan Brusic) #2

Which version of elasticsearch are you using? According to the following
issue, the



(Ivan Brusic) #3

Sorry, sent early by mistake.

Which version of Elasticsearch are you using? According to the following
issue, the pattern_replace char filter was added in version 0.90.3:

https://github.com/elasticsearch/elasticsearch/issues/3197
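
To make the version question concrete, here is a minimal Python sketch that compares a version string against 0.90.3, the release this thread names as the first one with the filter. The function names are hypothetical, and the parser only handles plain dotted numeric versions (no "Beta" or "RC" suffixes). The running version itself can usually be read from the JSON that the node's root HTTP endpoint returns.

```python
# Assumption taken from this thread: the pattern_replace char filter
# first shipped in Elasticsearch 0.90.3.
PATTERN_REPLACE_MIN_VERSION = (0, 90, 3)


def parse_version(version):
    """Turn a dotted numeric version such as '0.90.1' into (0, 90, 1)."""
    return tuple(int(part) for part in version.split("."))


def supports_pattern_replace(version):
    """Return True if the given version is at least 0.90.3."""
    return parse_version(version) >= PATTERN_REPLACE_MIN_VERSION


print(supports_pattern_replace("0.90.1"))  # False
print(supports_pattern_replace("0.90.3"))  # True
```

Tuples are compared element by element, so "0.90.10" correctly sorts after "0.90.3", which a plain string comparison would get wrong.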

Cheers,

Ivan



(Arjit Gupta) #4

The Elasticsearch version is 0.90.1. I think the pattern_replace char filter
came in 0.90.3.
Can you please tell me how I can make an analyzer that removes all the
special characters and the stop words, but emits only one token for the
input? I tried
https://gist.github.com/arjitgupta/7798768



(Ivan Brusic) #5

You can use the mapping char filter if you are still on 0.90.1.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-mapping-charfilter.html
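
As a rough illustration of what a mapping char filter does to the raw text before tokenization, here is a small Python simulation. It is only a sketch: the "key=>value" rule strings mimic the style used in the linked documentation, the three rules are hypothetical, and whether 0.90.1 accepts an empty replacement has not been verified here.

```python
def parse_rules(rules):
    """Parse 'key=>value' rule strings into a dict; an empty value deletes the key."""
    table = {}
    for rule in rules:
        key, _, value = rule.partition("=>")
        table[key] = value
    return table


def apply_mapping(text, table):
    """Scan left to right, replacing the longest matching key at each position."""
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if key and text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No rule matched here, so copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical rules: delete a few punctuation characters.
rules = ["-=>", "_=>", ".=>"]
print(apply_mapping("This is a Elastic-Search Test01", parse_rules(rules)))
# This is a ElasticSearch Test01
```

This covers the "remove special characters" half of the use case only; the stop words are still present in the output.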

You cannot use a stop filter (effectively) with the keyword tokenizer. The
stop filter works on the tokens emitted by the tokenizer, and the keyword
tokenizer emits the whole input as a single token, so the filter never sees
the individual words.
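
The point above can be shown with a small Python simulation of the two pipelines. This is a sketch, not Elasticsearch code: the stop word set is hypothetical, and the case-insensitive comparison is a simplification.

```python
STOP_WORDS = {"this", "is", "a"}  # hypothetical stop word list


def keyword_tokenize(text):
    """Emit the whole input as one token, as a keyword tokenizer does."""
    return [text]


def whitespace_tokenize(text):
    """Split the input into one token per whitespace-separated word."""
    return text.split()


def stop_filter(tokens):
    """Drop tokens that are stop words; only whole tokens are compared."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


text = "This is a Elastic-Search Test01"
print(stop_filter(keyword_tokenize(text)))
# ['This is a Elastic-Search Test01']
print(stop_filter(whitespace_tokenize(text)))
# ['Elastic-Search', 'Test01']
```

With the keyword tokenizer the only token is the full string, which is not itself a stop word, so nothing is removed. Splitting first lets the filter work, but then the output is no longer a single token.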

Cheers,

Ivan



(Arjit Gupta) #6

So how can I effectively remove stop words and still have the input come out
as a single token?

Thanks,
Arjit



(Ivan Brusic) #7

Quite simply: you can't. At least not in any way I can think of. You can
fake it somewhat with the mapping or pattern_replace char filter, but you
would have to account for word boundaries yourself, which are normally
taken care of by the tokenizer.
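
A minimal Python sketch of that workaround, to show where the word boundaries come in. It only simulates the idea with Python's `re` module; the stop word list and both patterns are hypothetical, and a real pattern_replace char filter uses Java regular expressions, which may differ in detail.

```python
import re

STOP_WORDS = ["this", "is", "a"]  # hypothetical stop word list

# The \b anchors keep "is" from matching inside "This" and "a" from
# matching inside "Elastic" or "Search".
STOP_WORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in STOP_WORDS) + r")\b",
    re.IGNORECASE,
)
NON_ALNUM_RE = re.compile(r"[^A-Za-z0-9]")


def squash(text):
    """Remove stop words first, then every non-alphanumeric character."""
    without_stop_words = STOP_WORD_RE.sub("", text)
    return NON_ALNUM_RE.sub("", without_stop_words)


print(squash("This is a Elastic-Search Test01"))  # ElasticSearchTest01
```

The boundary handling is also where this breaks down: `\b` treats a hyphen as a boundary, so `squash("Plan-A")` returns `"Plan"` because the trailing "A" is matched as a stop word. A tokenizer would normally make that decision for you.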

--
Ivan

Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to elasticsearc...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/
msgid/elasticsearch/c42ff608-e14c-4689-b0eb-41a197d10624%
40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/d39e6dc4-b0c6-4cfb-814b-880482fa854f%40googlegroups.com
.

For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/MJ56Ld98CGA/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQAYSdNxvEiZoNnWNZEP4nR3sZDN0AufzM2feTujSPPPsQ%40mail.gmail.com
.

For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CADe%2BHd92gDr1hX6%2BpvTQk%2Bdx1VL3AfB3ZtUHrKrWpVUykiF6Hw%40mail.gmail.com
.

For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQBp%2BzPEQ-PL776fQoRhHDjhO4m2v3NGULG6M7yV%3D8LsJw%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Arjit Gupta) #8

Thanks a lot for the reply, Ivan. Can I just remove the special characters
and whitespace from the token?
Example:

I am @testing Elastic-search
should become
IamtestingElasticsearch

so that I can do fuzzy matches on it.

Thanks,
Arjit



(sina.tamanna) #9

Sure, you could just use your special_char_pattern filter and extend it to
include whitespace, or create a similar filter just for whitespace
with the regex \s+.
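
A minimal sketch of that suggestion, written as Python that builds the index settings and previews the regex locally. The names strip_non_alnum and single_token are hypothetical, and the character class [^A-Za-z0-9]+ is an assumption about what counts as a "special character". As noted earlier in the thread, the pattern_replace char filter needs 0.90.3 or later. The preview uses Python's re module, which should behave the same as Java's regex engine for a simple class like this one. It does not remove stop words.

```python
import json
import re

# Hypothetical names: "strip_non_alnum" and "single_token" are illustrative only.
# Assumption: anything that is not an ASCII letter or digit (whitespace included)
# should be dropped before tokenizing.
INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "char_filter": {
                "strip_non_alnum": {
                    "type": "pattern_replace",
                    "pattern": "[^A-Za-z0-9]+",
                    "replacement": "",
                }
            },
            "analyzer": {
                "single_token": {
                    "type": "custom",
                    # Char filters run before the tokenizer sees the text.
                    "char_filter": ["strip_non_alnum"],
                    # The keyword tokenizer emits the whole input as one token.
                    "tokenizer": "keyword",
                }
            },
        }
    }
}


def preview(text):
    """Apply the same pattern locally to see the single token that would be emitted."""
    char_filter = INDEX_SETTINGS["settings"]["analysis"]["char_filter"]["strip_non_alnum"]
    return re.sub(char_filter["pattern"], char_filter["replacement"], text)


if __name__ == "__main__":
    # JSON body that could be sent when creating the index.
    print(json.dumps(INDEX_SETTINGS, indent=2))
    print(preview("I am @testing Elastic-search"))
```

Folding whitespace into the same character class keeps it to one char filter; a separate \s+ filter listed alongside it in char_filter should give the same result.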



(Arjit Gupta) #10

Thanks a lot Ivan and Sina :).

Thanks,
Arjit



(system) #11