Querystring not being analyzed


(Robert Eanes) #1

I'm trying to get a simple case-insensitive search to work by running
the querystring through the "standard" analyzer, but it doesn't seem
to be working for me. The index seems to be run through the analyzer,
as a querystring that is all lowercase matches a mixed-case record. A
mixed or uppercase querystring never matches though. I've tried
explicitly setting the analyzer parameter, with no effect.
I'm using 0.4.0, with no changes to the config file (which is empty).

Thanks for any help you can give me - here is an example session
illustrating the problem:

~ $ curl -XPUT http://localhost:9200/twitter/tweet/1 -d '
{
user : "Kimchy",
postDate : "2009-11-15T14:12:12",
message : "trying out Elastic Search"
}
'
{"ok":true,"_index":"twitter","_type":"tweet","_id":"1"}

~ $ curl -XGET "http://localhost:9200/twitter/tweet/_search?q=user:kimchy"
{"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"hits":[{"_index":"twitter","_type":"tweet","_id":"1", "_source" :
{
user : "Kimchy",
postDate : "2009-11-15T14:12:12",
message : "trying out Elastic Search"
}
}]}}

So far, so good. The lowercase term matched the mixed-case original,
but then:

~ $ curl -XGET "http://localhost:9200/twitter/tweet/_search?q=user:KIMCHY"
{"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"hits":[]}}

With explicit analyzer specified:

~ $ curl -XGET "http://localhost:9200/twitter/tweet/_search?q=user:KIMCHY&analyzer=standard"
{"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"hits":[]}}
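
The behaviour in the session above can be sketched roughly as follows. This is only an illustration of the suspected mechanism, not elasticsearch code: the helper names `lowercase_term` and `term_matches` are made up for this example, and `tr` merely stands in for the lowercasing step of the "standard" analyzer. The assumption is that the document is lowercased at index time while the query term is compared as typed.

```shell
# Hypothetical helper: lowercase a term, approximating what a
# lowercasing analyzer would do to it.
lowercase_term() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Simulated index: the term stored after index-time analysis of "Kimchy".
indexed_term=$(lowercase_term "Kimchy")

# Succeeds only when the query term equals the stored term exactly,
# i.e. when no analysis is applied to the query side.
term_matches() {
  [ "$1" = "$indexed_term" ]
}

# Try the raw queries from the session, plus an analyzed version of the
# uppercase query.
for q in kimchy KIMCHY "$(lowercase_term KIMCHY)"; do
  if term_matches "$q"; then
    echo "$q: hit"
  else
    echo "$q: no hit"
  fi
done
```

Until a fixed build is available, lowercasing the term on the client before putting it in `q=` would presumably sidestep the problem for simple single-term queries like the one above.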


(Shay Banon) #2

Hi,

Yep, this is a bug in how I handle this case in the enhanced query parser
in elasticsearch. I have already pushed a fix. Can you give it a go?

-shay.banon

On Tue, Feb 23, 2010 at 7:18 PM, Robert Eanes reanes@gmail.com wrote:


(Robert Eanes) #3

Thanks for responding so quickly. I'd love to test it out, but I'm
having trouble building the latest version from github. This is
almost certainly due to my very limited java experience, rather than
an actual problem with the code. For instance I'd never heard of
gradle until today. In any case, I don't want to bother you with
that, just knowing it's fixed for an upcoming release is great for
me. Thanks again!

On Feb 23, 3:16 pm, Shay Banon shay.ba...@elasticsearch.com wrote:


(Lukáš Vlček) #4

I think you will not bother anybody when you share your problems. That is
what the user mailing list is for :)
You can always clone fresh ES code into a new directory and run gradlew. This
will build a fresh ES instance ready for your tests.

Regards,
Lukas

On Wed, Feb 24, 2010 at 4:22 AM, Robert Eanes reanes@gmail.com wrote:


(Shay Banon) #5

Also, make sure you have the Java 6 JDK installed
(http://java.sun.com/javase/downloads/widget/jdk6.jsp), and for simplicity,
set JAVA_HOME to point to the installation, and add $JAVA_HOME/bin to your
path.
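
A minimal sketch of that environment setup for a POSIX shell. The function name `use_jdk` and the path `/path/to/jdk6` are placeholders invented for this example, not something from the thread; substitute the directory where your JDK is actually installed.

```shell
# Hypothetical helper: point JAVA_HOME at a JDK install and put its
# bin/ directory first on PATH so its java/javac are found first.
use_jdk() {
  JAVA_HOME="$1"
  PATH="$JAVA_HOME/bin:$PATH"
  export JAVA_HOME PATH
}

# Placeholder path (an assumption): replace with your real JDK 6 location.
use_jdk "/path/to/jdk6"
```

After this, `java -version` and `javac -version` in the same shell should report the JDK you pointed at, assuming the path is correct.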

-shay.banon

On Wed, Feb 24, 2010 at 9:09 AM, Lukáš Vlček lukas.vlcek@gmail.com wrote:


(Robert Eanes) #6

Thanks! I was able to successfully build the second time I tried with a clean checkout, and I can confirm that the test in my original email now works as expected. In case you are interested, here is the log of what I saw during the build, including the failed and successful attempts. Note that I had to kill it after the stack trace appeared in the first attempt as it was hanging:

~/src $ git clone git://github.com/elasticsearch/elasticsearch.git
Initialized empty Git repository in /Users/reanes/src/elasticsearch/.git/
remote: Counting objects: 4169, done.
remote: Compressing objects: 100% (1909/1909), done.
remote: Total 4169 (delta 2408), reused 3516 (delta 1867)
Receiving objects: 100% (4169/4169), 1.34 MiB | 893 KiB/s, done.
Resolving deltas: 100% (2408/2408), done.
~/src $ cd elasticsearch
~/src/elasticsearch(master) $ java -version
java version "1.6.0_17"
Java(TM) SE Runtime Environment (build 1.6.0_17-b04-248-10M3025)
Java HotSpot(TM) 64-Bit Server VM (build 14.3-b01-101, mixed mode)
~/src/elasticsearch(master) $ ./gradlew build devRelease
:test-testng:compileJava
:elasticsearch:compileJava
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
:test-testng:processResources
:elasticsearch:processResources
:test-testng:classes
:elasticsearch:classes
:test-testng:jar
:elasticsearch:jar
:elasticsearch:uploadDefaultInternal
:benchmark-micro:compileJava
:benchmark-micro:processResources
:benchmark-micro:classes
:benchmark-micro:jar
:test-testng:assemble
:elasticsearch:assemble
:benchmark-micro:assemble
:test-testng:compileTestJava
:test-testng:uploadDefaultInternal
:elasticsearch:compileTestJava
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
:benchmark-micro:compileTestJava
:test-testng:processTestResources
:elasticsearch:processTestResources
:benchmark-micro:processTestResources
:test-testng:testClasses
:elasticsearch:testClasses
:benchmark-micro:testClasses
:test-testng:test
:elasticsearch:test
[ant:testng] ........................................
[ant:testng] ........................................
[ant:testng] ...........Exception in thread "elasticsearch[tp]-pool-10-thread-1" java.lang.AssertionError:
[ant:testng] Expected: "Action [sayHelloException] not found"
[ant:testng] got: "bad message !!!"
[ant:testng]
[ant:testng] at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:21)
[ant:testng] at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
[ant:testng] at org.elasticsearch.transport.netty.SimpleNettyTransportTests$4.handleException(SimpleNettyTransportTests.java:133)
[ant:testng] at org.elasticsearch.transport.PlainTransportFuture.handleException(PlainTransportFuture.java:143)
[ant:testng] at org.elasticsearch.transport.netty.MessageChannelHandler$2.run(MessageChannelHandler.java:125)
[ant:testng] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
[ant:testng] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
[ant:testng] at java.lang.Thread.run(Thread.java:637)

^C~/src/elasticsearch(master) $ ./gradlew build devRelease
:test-testng:compileJava
:elasticsearch:compileJava
:test-testng:processResources
:elasticsearch:processResources
:test-testng:classes
:elasticsearch:classes
:test-testng:jar
:elasticsearch:jar
:elasticsearch:uploadDefaultInternal
:benchmark-micro:compileJava
:benchmark-micro:processResources
:benchmark-micro:classes
:benchmark-micro:jar
:test-testng:assemble
:elasticsearch:assemble
:benchmark-micro:assemble
:test-testng:compileTestJava
:test-testng:uploadDefaultInternal
:elasticsearch:compileTestJava
:benchmark-micro:compileTestJava
:test-testng:processTestResources
:elasticsearch:processTestResources
:benchmark-micro:processTestResources
:test-testng:testClasses
:elasticsearch:testClasses
:benchmark-micro:testClasses
:test-testng:test
:elasticsearch:test
[ant:testng] ........................................
[ant:testng] ........................................
[ant:testng] ........................................
[ant:testng] .......
:benchmark-micro:test
:test-testng:check
:elasticsearch:check
:benchmark-micro:check
:test-testng:build
:elasticsearch:build
:benchmark-micro:build
:test-integration:compileJava
:test-integration:processResources
:test-integration:classes
:test-integration:jar
[ant:jar] Warning: skipping jar archive /Users/reanes/src/elasticsearch/modules/test/integration/build/libs/elasticsearch-test-integration-0.5.0.jar because no files were included.
[ant:jar] Warning: skipping jar archive /Users/reanes/src/elasticsearch/modules/test/integration/build/libs/elasticsearch-test-integration-0.5.0.jar because no files were included.
:test-integration:assemble
:test-integration:compileTestJava
Note: /Users/reanes/src/elasticsearch/modules/test/integration/src/test/java/org/elasticsearch/test/integration/indexlifecycle/IndexLifecycleActionTests.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
:test-integration:processTestResources
:test-integration:testClasses
:test-integration:test
[ant:testng] ........................................
[ant:testng] ...........
:test-integration:check
:test-integration:build
:explodedDist
:zip
:devRelease

BUILD SUCCESSFUL

On Feb 24, 2010, at 3:16 AM, Shay Banon wrote:

