Custom analyzer not applied on property in query


(Filip) #1

Hi,

I'm using a custom Lucene analyzer, made available through a plugin.
In my index mapping file, I specified this analyzer using:

{
  "article" : {
    "_all" : {
      "indexAnalyzer" : "foo_analyzer",
      "searchAnalyzer" : "foo_analyzer"
    },
    "properties" : {
      "title" : {
        "type" : "string",
        "index" : "analyzed"
      },
      "content" : {
        "type" : "string",
        "index" : "analyzed"
      }
      ...
}

It turns out that the analyzer is used properly when searching over
the entire article, e.g.
http://localhost:9200/development_articles/_search?q=foo-bar

But the analyzer is not used when I search on a specific property;
the default analyzer seems to be used instead:
http://localhost:9200/development_articles/_search?q=someproperty:foo-bar

The goal of this analyzer is to make sure that words joined with a
hyphen (foo-bar) are treated as one word, and that a query doesn't
match on only one part of the word (foo or bar separately).

Any idea?
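
To make the goal above concrete, here is a minimal Python sketch of the two behaviours. It is not Lucene and not the foo_analyzer plugin: both tokenizers and the matches helper are hypothetical stand-ins, limited to ASCII letters and digits, that only illustrate why a hyphen-splitting analyzer lets "foo" match an article containing "foo-bar" while a hyphen-preserving one does not.

```python
import re


def split_on_hyphen_tokens(text):
    """Lowercase, then split on every character that is not a letter or digit.

    Stand-in for an analyzer that breaks "foo-bar" into "foo" and "bar".
    """
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def keep_hyphen_tokens(text):
    """Lowercase, then keep hyphen-joined words together as a single token.

    Stand-in for the behaviour the custom analyzer is meant to provide.
    """
    return re.findall(r"[0-9a-z]+(?:-[0-9a-z]+)*", text.lower())


def matches(query, document, tokenize):
    """Return True if at least one query token also occurs in the document."""
    return bool(set(tokenize(query)) & set(tokenize(document)))


article = "An article about foo-bar"
print(split_on_hyphen_tokens("foo-bar"))               # tokens when hyphens split words
print(keep_hyphen_tokens("foo-bar"))                   # tokens when hyphens are kept
print(matches("foo", article, split_on_hyphen_tokens))
print(matches("foo", article, keep_hyphen_tokens))
```

The same tokenizer has to be applied to both the indexed text and the query; if only one side keeps the hyphen, the tokens no longer line up.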


(Shay Banon) #2

You only specified your analyzer on the _all field, not on the other
fields (title, content). If you want it to apply to all (string) fields,
you can simply name it "default".
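
As a rough illustration of the per-field alternative, the Python sketch below builds a mapping in which every string field names the analyzer explicitly. The field-level "analyzer" key is an assumption here and is not taken from the thread (check it against the mapping documentation for your Elasticsearch version); the article_mapping helper is hypothetical, and the _all settings are copied from the original mapping.

```python
import json


def article_mapping(analyzer_name, string_fields):
    """Build an "article" mapping that names the analyzer on _all and on each string field."""
    properties = {
        field: {
            "type": "string",
            "index": "analyzed",
            # Assumption: field-level "analyzer" key, not shown in the thread.
            "analyzer": analyzer_name,
        }
        for field in string_fields
    }
    return {
        "article": {
            # Same _all settings as in the original mapping.
            "_all": {
                "indexAnalyzer": analyzer_name,
                "searchAnalyzer": analyzer_name,
            },
            "properties": properties,
        }
    }


mapping = article_mapping("foo_analyzer", ["title", "content"])
print(json.dumps(mapping, indent=2))
```

Documents that were indexed before such a change still carry the tokens produced by the old analyzer, so they would need to be indexed again for the new setting to affect matching.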

(In reply to Filip's message of Thu, May 10, 2012 at 1:03 PM.)


(Filip) #3

I tried specifying my analyzer on each field individually, but that
didn't work (same effect as with _all). I have also tried setting it
as the default analyzer in elasticsearch.yml:

index.analysis.analyzer.default.type: my.elasticsearch.FooAnalyzerProvider

But then I get this NPE when recreating the index for each store:

java.lang.NullPointerException
    at org.elasticsearch.index.analysis.FieldNameAnalyzer.reusableTokenStream(FieldNameAnalyzer.java:60)
    at org.elasticsearch.common.lucene.all.AllTokenStream.allTokenStream(AllTokenStream.java:38)
    at org.elasticsearch.common.lucene.all.AllField.tokenStreamValue(AllField.java:64)
    at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:111)
    at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:276)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:766)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2060)
    at org.elasticsearch.index.engine.robin.RobinEngine.innerIndex(RobinEngine.java:567)
    at org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:479)
    at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:323)
    at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:206)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:532)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)

(In reply to Shay Banon's message of May 13, 11:44 am.)


(Shay Banon) #4

That NPE is strange... Is there a chance for a recreation?

(In reply to Filip Neven's message of Mon, May 14, 2012 at 3:52 PM.)


(Filip) #5

Yes, I tried recreating the index. I first delete the existing indices using:

curl -X DELETE http://localhost:9200/_all

and then run the program that indexes the articles, but I get the NPE
on every indexing attempt.
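
If "recreation" is taken to mean a self-contained reproduction, the two steps above could be sketched in Python roughly as follows. The host and index name come from the thread; the document id, the sample document body, and both helper functions are hypothetical. The sketch assumes the default analyzer is already configured in elasticsearch.yml as described earlier. Note that DELETE on /_all removes every index on the node, so this should only be pointed at a disposable test node.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:9200"  # host from the thread
INDEX = "development_articles"      # index name from the thread


def build_repro_requests(base_url=BASE_URL, index=INDEX):
    """Build the two requests: delete all indices, then index one sample article."""
    # Hypothetical sample document; any article containing a hyphenated word would do.
    body = json.dumps({"title": "foo-bar", "content": "An article about foo-bar"})
    return [
        urllib.request.Request(f"{base_url}/_all", method="DELETE"),
        urllib.request.Request(
            f"{base_url}/{index}/article/1",  # "1" is an arbitrary document id
            data=body.encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="PUT",
        ),
    ]


def run(requests):
    """Send each request in order and print the status, or the error body on failure."""
    for request in requests:
        try:
            with urllib.request.urlopen(request) as response:
                print(request.get_method(), request.full_url, response.status)
        except urllib.error.HTTPError as error:
            details = error.read().decode("utf-8", "replace")
            print(request.get_method(), request.full_url, error.code, details)


# To execute against a disposable test node:
# run(build_repro_requests())
```

Keeping the request construction separate from the sending makes it easy to inspect exactly what would be sent before anything is deleted.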

(In reply to kimchy's message of Tuesday, May 15, 2012, 22:41:05 UTC+2.)
206)
at

org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction

$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:
532)
at

org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction
$AsyncShardOperationAction
$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.ThreadPoolExecutor
$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)

On May 13, 11:44 am, Shay Banon kim...@gmail.com wrote:

You only specified your analyzer on the _all field, and not all the
other
fields (title, content). If you want it to apply to all (string) fields,
you can simply name it "default".

On Thu, May 10, 2012 at 1:03 PM, Filip filip.ne...@gmail.com wrote:

Hi,

I'm using a custom Lucene analyzer, made available through a plugin.
In my index mapping file, I specified this analyzer using:

{
"article" : {
"_all" : {
"indexAnalyzer" : "foo_analyzer",
"searchAnalyzer" : "foo_analyzer"
}
"properties" : {
"title" : {
"type" : "string",
"index" : "analyzed"
}
"content" : {
"type" : "string",
"index" : "analyzed"
}
...
}

It turns out that the analyzer is used properly when searching over
the entire article, e.g.
http://localhost:9200/development_articles/_search?q=foo-bar

But the analyzer is not used when I search on a specific property, it
seems to use the default analyzer:
http://localhost:9200/development_articles/_search?q=someproperty:foo.
..

The goal of this analyzer is to make sure words concatenated with a
hyphen (foo-bar) are considered one word, and that the query doesn't
match on one part of the word only (foo or bar separately)

Any idea?

Op dinsdag 15 mei 2012 22:41:05 UTC+2 schreef kimchy het volgende:

That NPE is strange..., is there a chance for a recreation?

On Mon, May 14, 2012 at 3:52 PM, Filip Neven filip.neven@gmail.comwrote:

I tried specifying my analyzer on each field individually, but that
didn't work (same effect as with _all). I have also tried setting it
as the default analyzer in elasticsearch.yml:

index.analysis.analyzer.default.type:
my.elasticsearch.FooAnalyzerProvider

But then I get this NPE when recreating the index for each store:

java.lang.NullPointerException
at

org.elasticsearch.index.analysis.FieldNameAnalyzer.reusableTokenStream(FieldNameAnalyzer.java:
60)
at

org.elasticsearch.common.lucene.all.AllTokenStream.allTokenStream(AllTokenStream.java:
38)
at

org.elasticsearch.common.lucene.all.AllField.tokenStreamValue(AllField.java:
64)
at

org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:
111)
at

org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:
276)
at

org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:
766)
at
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:
2060)
at

org.elasticsearch.index.engine.robin.RobinEngine.innerIndex(RobinEngine.java:
567)
at
org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:
479)
at

org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:
323)
at

org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:
206)
at

org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction

$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:
532)
at

org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction
$AsyncShardOperationAction
$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.ThreadPoolExecutor
$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)

On May 13, 11:44 am, Shay Banon kim...@gmail.com wrote:

You only specified your analyzer on the _all field, and not all the
other
fields (title, content). If you want it to apply to all (string) fields,
you can simply name it "default".

On Thu, May 10, 2012 at 1:03 PM, Filip filip.ne...@gmail.com wrote:

Hi,

I'm using a custom Lucene analyzer, made available through a plugin.
In my index mapping file, I specified this analyzer using:

{
"article" : {
"_all" : {
"indexAnalyzer" : "foo_analyzer",
"searchAnalyzer" : "foo_analyzer"
}
"properties" : {
"title" : {
"type" : "string",
"index" : "analyzed"
}
"content" : {
"type" : "string",
"index" : "analyzed"
}
...
}

It turns out that the analyzer is used properly when searching over
the entire article, e.g.
http://localhost:9200/development_articles/_search?q=foo-bar

But the analyzer is not used when I search on a specific property, it
seems to use the default analyzer:
http://localhost:9200/development_articles/_search?q=someproperty:foo.
..

The goal of this analyzer is to make sure words concatenated with a
hyphen (foo-bar) are considered one word, and that the query doesn't
match on one part of the word only (foo or bar separately)

Any idea?

Op dinsdag 15 mei 2012 22:41:05 UTC+2 schreef kimchy het volgende:

That NPE is strange..., is there a chance for a recreation?

On Mon, May 14, 2012 at 3:52 PM, Filip Neven filip.neven@gmail.comwrote:

I tried specifying my analyzer on each field individually, but that
didn't work (same effect as with _all). I have also tried setting it
as the default analyzer in elasticsearch.yml:

index.analysis.analyzer.default.type:
my.elasticsearch.FooAnalyzerProvider

But then I get this NPE when recreating the index for each store:

java.lang.NullPointerException
at

org.elasticsearch.index.analysis.FieldNameAnalyzer.reusableTokenStream(FieldNameAnalyzer.java:
60)
at

org.elasticsearch.common.lucene.all.AllTokenStream.allTokenStream(AllTokenStream.java:
38)
at

org.elasticsearch.common.lucene.all.AllField.tokenStreamValue(AllField.java:
64)
at

org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:
111)
at

org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:
276)
at

org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:
766)
at
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:
2060)
at

org.elasticsearch.index.engine.robin.RobinEngine.innerIndex(RobinEngine.java:
567)
at
org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:
479)
at

org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:
323)
at

org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:
206)
at

org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction

$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:
532)
at

org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction
$AsyncShardOperationAction
$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.ThreadPoolExecutor
$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)

On May 13, 11:44 am, Shay Banon kim...@gmail.com wrote:

You only specified your analyzer on the _all field, and not all the
other
fields (title, content). If you want it to apply to all (string) fields,
you can simply name it "default".

On Thu, May 10, 2012 at 1:03 PM, Filip filip.ne...@gmail.com wrote:

Hi,

I'm using a custom Lucene analyzer, made available through a plugin.
In my index mapping file, I specified this analyzer using:

{
"article" : {
"_all" : {
"indexAnalyzer" : "foo_analyzer",
"searchAnalyzer" : "foo_analyzer"
}
"properties" : {
"title" : {
"type" : "string",
"index" : "analyzed"
}
"content" : {
"type" : "string",
"index" : "analyzed"
}
...
}

It turns out that the analyzer is used properly when searching over
the entire article, e.g.
http://localhost:9200/development_articles/_search?q=foo-bar

But the analyzer is not used when I search on a specific property, it
seems to use the default analyzer:
http://localhost:9200/development_articles/_search?q=someproperty:foo.
..

The goal of this analyzer is to make sure words concatenated with a
hyphen (foo-bar) are considered one word, and that the query doesn't
match on one part of the word only (foo or bar separately)

Any idea?

Op dinsdag 15 mei 2012 22:41:05 UTC+2 schreef kimchy het volgende:

That NPE is strange..., is there a chance for a recreation?

On Mon, May 14, 2012 at 3:52 PM, Filip Neven filip.neven@gmail.comwrote:

I tried specifying my analyzer on each field individually, but that
didn't work (same effect as with _all). I have also tried setting it
as the default analyzer in elasticsearch.yml:

index.analysis.analyzer.default.type:
my.elasticsearch.FooAnalyzerProvider

But then I get this NPE when recreating the index for each store:

java.lang.NullPointerException
at

org.elasticsearch.index.analysis.FieldNameAnalyzer.reusableTokenStream(FieldNameAnalyzer.java:
60)
at

org.elasticsearch.common.lucene.all.AllTokenStream.allTokenStream(AllTokenStream.java:
38)
at

org.elasticsearch.common.lucene.all.AllField.tokenStreamValue(AllField.java:
64)
at

org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:
111)
at

org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:
276)
at

org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:
766)
at
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:
2060)
at

org.elasticsearch.index.engine.robin.RobinEngine.innerIndex(RobinEngine.java:
567)
at
org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:
479)
at

org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:
323)
at

org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:
206)
at

org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction

$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:
532)
at

org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction
$AsyncShardOperationAction
$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.ThreadPoolExecutor
$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)

On May 13, 11:44 am, Shay Banon kim...@gmail.com wrote:

You only specified your analyzer on the _all field, and not all the
other
fields (title, content). If you want it to apply to all (string) fields,
you can simply name it "default".

On Thu, May 10, 2012 at 1:03 PM, Filip filip.ne...@gmail.com wrote:

Hi,

I'm using a custom Lucene analyzer, made available through a plugin.
In my index mapping file, I specified this analyzer using:

{
"article" : {
"_all" : {
"indexAnalyzer" : "foo_analyzer",
"searchAnalyzer" : "foo_analyzer"
}
"properties" : {
"title" : {
"type" : "string",
"index" : "analyzed"
}
"content" : {
"type" : "string",
"index" : "analyzed"
}
...
}

It turns out that the analyzer is used properly when searching over
the entire article, e.g.
http://localhost:9200/development_articles/_search?q=foo-bar

But the analyzer is not used when I search on a specific property, it
seems to use the default analyzer:
http://localhost:9200/development_articles/_search?q=someproperty:foo.
..

The goal of this analyzer is to make sure words concatenated with a
hyphen (foo-bar) are considered one word, and that the query doesn't
match on one part of the word only (foo or bar separately)

Any idea?

Op dinsdag 15 mei 2012 22:41:05 UTC+2 schreef kimchy het volgende:

That NPE is strange..., is there a chance for a recreation?

On Mon, May 14, 2012 at 3:52 PM, Filip Neven filip.neven@gmail.comwrote:

I tried specifying my analyzer on each field individually, but that
didn't work (same effect as with _all). I have also tried setting it
as the default analyzer in elasticsearch.yml:

index.analysis.analyzer.default.type:
my.elasticsearch.FooAnalyzerProvider

But then I get this NPE when recreating the index for each store:

java.lang.NullPointerException
at

org.elasticsearch.index.analysis.FieldNameAnalyzer.reusableTokenStream(FieldNameAnalyzer.java:
60)
at

org.elasticsearch.common.lucene.all.AllTokenStream.allTokenStream(AllTokenStream.java:
38)
at

org.elasticsearch.common.lucene.all.AllField.tokenStreamValue(AllField.java:
64)
at

org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:
111)
at

org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:
276)
at

org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:
766)
at
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:
2060)
at

org.elasticsearch.index.engine.robin.RobinEngine.innerIndex(RobinEngine.java:
567)
at
org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:
479)
at

org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:
323)
at

org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:
206)
at

org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction

$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:
532)
at

org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction
$AsyncShardOperationAction
$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.ThreadPoolExecutor
$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)

On May 13, 11:44 am, Shay Banon kim...@gmail.com wrote:

You only specified your analyzer on the _all field, and not all the
other
fields (title, content). If you want it to apply to all (string) fields,
you can simply name it "default".

On Thu, May 10, 2012 at 1:03 PM, Filip filip.ne...@gmail.com wrote:

Hi,

I'm using a custom Lucene analyzer, made available through a plugin.
In my index mapping file, I specified this analyzer using:

{
"article" : {
"_all" : {
"indexAnalyzer" : "foo_analyzer",
"searchAnalyzer" : "foo_analyzer"
}
"properties" : {
"title" : {
"type" : "string",
"index" : "analyzed"
}
"content" : {
"type" : "string",
"index" : "analyzed"
}
...
}

It turns out that the analyzer is used properly when searching over
the entire article, e.g.
http://localhost:9200/development_articles/_search?q=foo-bar

But the analyzer is not used when I search on a specific property, it
seems to use the default analyzer:
http://localhost:9200/development_articles/_search?q=someproperty:foo.
..

The goal of this analyzer is to make sure words concatenated with a
hyphen (foo-bar) are considered one word, and that the query doesn't
match on one part of the word only (foo or bar separately)

Any idea?


(Shay Banon) #6

A recreation means something that I can run on my end to try and recreate
it and see why it happens...

On Wed, May 16, 2012 at 1:28 PM, Filip Neven filip.neven@gmail.com wrote:

Yes, I tried recreating the index. I first destroy the current one using:

curl -X DELETE http://localhost:9200/_all

I then run the program that indexes the articles, but I get the NPE
on every indexing attempt.

Op dinsdag 15 mei 2012 22:41:05 UTC+2 schreef kimchy het volgende:

That NPE is strange..., is there a chance for a recreation?

On Mon, May 14, 2012 at 3:52 PM, Filip Neven filip.neven@gmail.comwrote:

I tried specifying my analyzer on each field individually, but that
didn't work (same effect as with _all). I have also tried setting it
as the default analyzer in elasticsearch.yml:

index.analysis.analyzer.**default.type:
my.elasticsearch.**FooAnalyzerProvider

But then I get this NPE when recreating the index for each store:

java.lang.NullPointerException
at
org.elasticsearch.index.analysis.FieldNameAnalyzer.
reusableTokenStream(**FieldNameAnalyzer.java:
60)
at
org.elasticsearch.common.lucene.all.AllTokenStream.
allTokenStream(AllTokenStream.**java:
38)
at
org.elasticsearch.common.lucene.all.AllField.
tokenStreamValue(AllField.**java:
64)
at
org.apache.lucene.index.**DocInverterPerField.processFields(
DocInverterPerField.java:
111)
at
org.apache.lucene.index.**DocFieldProcessorPerThread.*processDocument(
*DocFieldProcessorPerThread.java:
276)
at
org.apache.lucene.index.DocumentsWriter.updateDocument(
DocumentsWriter.java:
766)
at org.apache.lucene.index.IndexWriter.addDocument(
IndexWriter.java:
2060)
at
org.elasticsearch.index.engine.robin.RobinEngine.
innerIndex(RobinEngine.java:
567)
at
org.elasticsearch.index.engine.robin.RobinEngine.
index(RobinEngine.java:
479)
at
org.elasticsearch.index.shard.service.InternalIndexShard.
index(InternalIndexShard.java:
323)
at
org.elasticsearch.action.index.TransportIndexAction.
shardOperationOnPrimary(TransportIndexAction.java:
206)
at
org.elasticsearch.action.support.replication.
TransportShardReplicationOpera
tionAction
$AsyncShardOperationAction.performOnPrimary(
TransportShardReplicationOpera
tionAction.java:
532)
at
org.elasticsearch.action.support.replication.
TransportShardReplicationOpera
tionAction
$AsyncShardOperationAction
$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.**ThreadPoolExecutor
$Worker.runTask(**ThreadPoolExecutor.java:886)
at java.util.concurrent.**ThreadPoolExecutor
$Worker.run(**ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.**java:680)

On May 13, 11:44 am, Shay Banon kim...@gmail.com wrote:

You only specified your analyzer on the _all field, and not all the
other
fields (title, content). If you want it to apply to all (string)
fields,
you can simply name it "default".

On Thu, May 10, 2012 at 1:03 PM, Filip filip.ne...@gmail.com wrote:

Hi,

I'm using a custom Lucene analyzer, made available through a plugin.
In my index mapping file, I specified this analyzer using:

{
"article" : {
"_all" : {
"indexAnalyzer" : "foo_analyzer",
"searchAnalyzer" : "foo_analyzer"
}
"properties" : {
"title" : {
"type" : "string",
"index" : "analyzed"
}
"content" : {
"type" : "string",
"index" : "analyzed"
}
...
}

It turns out that the analyzer is used properly when searching over
the entire article, e.g.
http://localhost:9200/**development_articles/_search?**q=foo-barhttp://localhost:9200/development_articles/_search?q=foo-bar

But the analyzer is not used when I search on a specific property, it
seems to use the default analyzer:
http://localhost:9200/development_articles/_search?
q=someproperty:foo.http://localhost:9200/development_articles/_search?q=someproperty:foo.
..

The goal of this analyzer is to make sure words concatenated with a
hyphen (foo-bar) are considered one word, and that the query doesn't
match on one part of the word only (foo or bar separately)

Any idea?

Op dinsdag 15 mei 2012 22:41:05 UTC+2 schreef kimchy het volgende:

That NPE is strange..., is there a chance for a recreation?

On Mon, May 14, 2012 at 3:52 PM, Filip Neven filip.neven@gmail.comwrote:

I tried specifying my analyzer on each field individually, but that
didn't work (same effect as with _all). I have also tried setting it
as the default analyzer in elasticsearch.yml:

index.analysis.analyzer.**default.type:
my.elasticsearch.**FooAnalyzerProvider

But then I get this NPE when recreating the index for each store:

java.lang.NullPointerException
at
org.elasticsearch.index.analysis.FieldNameAnalyzer.
reusableTokenStream(**FieldNameAnalyzer.java:
60)
at
org.elasticsearch.common.lucene.all.AllTokenStream.
allTokenStream(AllTokenStream.**java:
38)
at
org.elasticsearch.common.lucene.all.AllField.
tokenStreamValue(AllField.**java:
64)
at
org.apache.lucene.index.**DocInverterPerField.processFields(
DocInverterPerField.java:
111)
at
org.apache.lucene.index.**DocFieldProcessorPerThread.*processDocument(
*DocFieldProcessorPerThread.java:
276)
at
org.apache.lucene.index.DocumentsWriter.updateDocument(
DocumentsWriter.java:
766)
at org.apache.lucene.index.IndexWriter.addDocument(
IndexWriter.java:
2060)
at
org.elasticsearch.index.engine.robin.RobinEngine.
innerIndex(RobinEngine.java:
567)
at
org.elasticsearch.index.engine.robin.RobinEngine.
index(RobinEngine.java:
479)
at
org.elasticsearch.index.shard.service.InternalIndexShard.
index(InternalIndexShard.java:
323)
at
org.elasticsearch.action.index.TransportIndexAction.
shardOperationOnPrimary(TransportIndexAction.java:
206)
at
org.elasticsearch.action.support.replication.
TransportShardReplicationOpera
tionAction
$AsyncShardOperationAction.performOnPrimary(
TransportShardReplicationOpera
tionAction.java:
532)
at
org.elasticsearch.action.support.replication.
TransportShardReplicationOpera
tionAction
$AsyncShardOperationAction
$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.**ThreadPoolExecutor
$Worker.runTask(**ThreadPoolExecutor.java:886)
at java.util.concurrent.**ThreadPoolExecutor
$Worker.run(**ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.**java:680)

On May 13, 11:44 am, Shay Banon kim...@gmail.com wrote:

You only specified your analyzer on the _all field, and not all the
other
fields (title, content). If you want it to apply to all (string)
fields,
you can simply name it "default".

On Thu, May 10, 2012 at 1:03 PM, Filip filip.ne...@gmail.com wrote:

Hi,

I'm using a custom Lucene analyzer, made available through a plugin.
In my index mapping file, I specified this analyzer using:

{
"article" : {
"_all" : {
"indexAnalyzer" : "foo_analyzer",
"searchAnalyzer" : "foo_analyzer"
}
"properties" : {
"title" : {
"type" : "string",
"index" : "analyzed"
}
"content" : {
"type" : "string",
"index" : "analyzed"
}
...
}

It turns out that the analyzer is used properly when searching over
the entire article, e.g.
http://localhost:9200/**development_articles/_search?**q=foo-barhttp://localhost:9200/development_articles/_search?q=foo-bar

But the analyzer is not used when I search on a specific property, it
seems to use the default analyzer:
http://localhost:9200/development_articles/_search?
q=someproperty:foo.http://localhost:9200/development_articles/_search?q=someproperty:foo.
..

The goal of this analyzer is to make sure words concatenated with a
hyphen (foo-bar) are considered one word, and that the query doesn't
match on one part of the word only (foo or bar separately)

Any idea?

Op dinsdag 15 mei 2012 22:41:05 UTC+2 schreef kimchy het volgende:

That NPE is strange..., is there a chance for a recreation?

On Mon, May 14, 2012 at 3:52 PM, Filip Neven filip.neven@gmail.comwrote:

I tried specifying my analyzer on each field individually, but that
didn't work (same effect as with _all). I have also tried setting it
as the default analyzer in elasticsearch.yml:

index.analysis.analyzer.**default.type:
my.elasticsearch.**FooAnalyzerProvider

But then I get this NPE when recreating the index for each store:

java.lang.NullPointerException
at
org.elasticsearch.index.analysis.FieldNameAnalyzer.
reusableTokenStream(**FieldNameAnalyzer.java:
60)
at
org.elasticsearch.common.lucene.all.AllTokenStream.
allTokenStream(AllTokenStream.**java:
38)
at
org.elasticsearch.common.lucene.all.AllField.
tokenStreamValue(AllField.**java:
64)
at
org.apache.lucene.index.**DocInverterPerField.processFields(
DocInverterPerField.java:
111)
at
org.apache.lucene.index.**DocFieldProcessorPerThread.*processDocument(
*DocFieldProcessorPerThread.java:
276)
at
org.apache.lucene.index.DocumentsWriter.updateDocument(
DocumentsWriter.java:
766)
at org.apache.lucene.index.IndexWriter.addDocument(
IndexWriter.java:
2060)
at
org.elasticsearch.index.engine.robin.RobinEngine.
innerIndex(RobinEngine.java:
567)
at
org.elasticsearch.index.engine.robin.RobinEngine.
index(RobinEngine.java:
479)
at
org.elasticsearch.index.shard.service.InternalIndexShard.
index(InternalIndexShard.java:
323)
at
org.elasticsearch.action.index.TransportIndexAction.
shardOperationOnPrimary(TransportIndexAction.java:
206)
at
org.elasticsearch.action.support.replication.
TransportShardReplicationOpera
tionAction
$AsyncShardOperationAction.performOnPrimary(
TransportShardReplicationOpera
tionAction.java:
532)
at
org.elasticsearch.action.support.replication.
TransportShardReplicationOpera
tionAction
$AsyncShardOperationAction
$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.**ThreadPoolExecutor
$Worker.runTask(**ThreadPoolExecutor.java:886)
at java.util.concurrent.**ThreadPoolExecutor
$Worker.run(**ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.**java:680)

On May 13, 11:44 am, Shay Banon kim...@gmail.com wrote:

You only specified your analyzer on the _all field, and not all the
other
fields (title, content). If you want it to apply to all (string)
fields,
you can simply name it "default".

On Thu, May 10, 2012 at 1:03 PM, Filip filip.ne...@gmail.com wrote:

Hi,

I'm using a custom Lucene analyzer, made available through a plugin.
In my index mapping file, I specified this analyzer using:

{
"article" : {
"_all" : {
"indexAnalyzer" : "foo_analyzer",
"searchAnalyzer" : "foo_analyzer"
}
"properties" : {
"title" : {
"type" : "string",
"index" : "analyzed"
}
"content" : {
"type" : "string",
"index" : "analyzed"
}
...
}

It turns out that the analyzer is used properly when searching over
the entire article, e.g.
http://localhost:9200/**development_articles/_search?**q=foo-barhttp://localhost:9200/development_articles/_search?q=foo-bar

But the analyzer is not used when I search on a specific property, it
seems to use the default analyzer:
http://localhost:9200/development_articles/_search?
q=someproperty:foo.http://localhost:9200/development_articles/_search?q=someproperty:foo.
..

The goal of this analyzer is to make sure words concatenated with a
hyphen (foo-bar) are considered one word, and that the query doesn't
match on one part of the word only (foo or bar separately)

Any idea?

Op dinsdag 15 mei 2012 22:41:05 UTC+2 schreef kimchy het volgende:

That NPE is strange..., is there a chance for a recreation?

On Mon, May 14, 2012 at 3:52 PM, Filip Neven filip.neven@gmail.comwrote:

I tried specifying my analyzer on each field individually, but that
didn't work (same effect as with _all). I have also tried setting it
as the default analyzer in elasticsearch.yml:

index.analysis.analyzer.**default.type:
my.elasticsearch.**FooAnalyzerProvider

But then I get this NPE when recreating the index for each store:

java.lang.NullPointerException
at
org.elasticsearch.index.analysis.FieldNameAnalyzer.
reusableTokenStream(**FieldNameAnalyzer.java:
60)
at
org.elasticsearch.common.lucene.all.AllTokenStream.
allTokenStream(AllTokenStream.**java:
38)
at
org.elasticsearch.common.lucene.all.AllField.
tokenStreamValue(AllField.**java:
64)
at
org.apache.lucene.index.**DocInverterPerField.processFields(
DocInverterPerField.java:
111)
at
org.apache.lucene.index.**DocFieldProcessorPerThread.*processDocument(
*DocFieldProcessorPerThread.java:
276)
at
org.apache.lucene.index.DocumentsWriter.updateDocument(
DocumentsWriter.java:
766)
at org.apache.lucene.index.IndexWriter.addDocument(
IndexWriter.java:
2060)
at
org.elasticsearch.index.engine.robin.RobinEngine.
innerIndex(RobinEngine.java:
567)
at
org.elasticsearch.index.engine.robin.RobinEngine.
index(RobinEngine.java:
479)
at
org.elasticsearch.index.shard.service.InternalIndexShard.
index(InternalIndexShard.java:
323)
at
org.elasticsearch.action.index.TransportIndexAction.
shardOperationOnPrimary(TransportIndexAction.java:
206)
at
org.elasticsearch.action.support.replication.
TransportShardReplicationOpera
tionAction
$AsyncShardOperationAction.performOnPrimary(
TransportShardReplicationOpera
tionAction.java:
532)
at
org.elasticsearch.action.support.replication.
TransportShardReplicationOpera
tionAction
$AsyncShardOperationAction
$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.**ThreadPoolExecutor
$Worker.runTask(**ThreadPoolExecutor.java:886)
at java.util.concurrent.**ThreadPoolExecutor
$Worker.run(**ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.**java:680)

On May 13, 11:44 am, Shay Banon kim...@gmail.com wrote:

You only specified your analyzer on the _all field, and not all the
other
fields (title, content). If you want it to apply to all (string)
fields,
you can simply name it "default".

On Thu, May 10, 2012 at 1:03 PM, Filip filip.ne...@gmail.com wrote:

Hi,

I'm using a custom Lucene analyzer, made available through a plugin.
In my index mapping file, I specified this analyzer using:

{
"article" : {
"_all" : {
"indexAnalyzer" : "foo_analyzer",
"searchAnalyzer" : "foo_analyzer"
}
"properties" : {
"title" : {
"type" : "string",
"index" : "analyzed"
}
"content" : {
"type" : "string",
"index" : "analyzed"
}
...
}

It turns out that the analyzer is used properly when searching over
the entire article, e.g.
http://localhost:9200/**development_articles/_search?**q=foo-barhttp://localhost:9200/development_articles/_search?q=foo-bar

But the analyzer is not used when I search on a specific property, it
seems to use the default analyzer:
http://localhost:9200/development_articles/_search?
q=someproperty:foo.http://localhost:9200/development_articles/_search?q=someproperty:foo.
..

The goal of this analyzer is to make sure words concatenated with a
hyphen (foo-bar) are considered one word, and that the query doesn't
match on one part of the word only (foo or bar separately)

Any idea?

Op dinsdag 15 mei 2012 22:41:05 UTC+2 schreef kimchy het volgende:

That NPE is strange. Is there any chance you could put together a recreation?

On Mon, May 14, 2012 at 3:52 PM, Filip Neven filip.neven@gmail.com wrote:

I tried specifying my analyzer on each field individually, but that
didn't work (same effect as with _all). I have also tried setting it
as the default analyzer in elasticsearch.yml:

index.analysis.analyzer.default.type: my.elasticsearch.FooAnalyzerProvider

But then I get this NPE when recreating the index for each store:

java.lang.NullPointerException
at org.elasticsearch.index.analysis.FieldNameAnalyzer.reusableTokenStream(FieldNameAnalyzer.java:60)
at org.elasticsearch.common.lucene.all.AllTokenStream.allTokenStream(AllTokenStream.java:38)
at org.elasticsearch.common.lucene.all.AllField.tokenStreamValue(AllField.java:64)
at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:111)
at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:276)
at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:766)
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2060)
at org.elasticsearch.index.engine.robin.RobinEngine.innerIndex(RobinEngine.java:567)
at org.elasticsearch.index.engine.robin.RobinEngine.index(RobinEngine.java:479)
at org.elasticsearch.index.shard.service.InternalIndexShard.index(InternalIndexShard.java:323)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:206)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:532)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)


(system) #7