Unexpected search results


(Radek) #1

Hi,

I have a problem formulating my search query. Could someone please
advise me how to fix the following test case so that I get a result
every time? The query string is variable in my case. Searching by _id
and padding _id with zeros are mandatory for the application from which
I extracted this use case.

$ curl -X PUT "http://devel1:9200/test" -d '{
  "mappings": {
    "item": {
      "_id": {"index": "not_analyzed", "store": "yes"},
      "properties": {"str": {"type": "string"}}
    }
  }
}'
$ curl -X PUT "http://devel1:9200/test/item/0001" -d '{"item": {"str": "a malibu house"}}'

$ curl -X POST "http://devel1:9200/test/_search" -d '{
  "query": {
    "query_string": {
      "query": "a house",
      "default_operator": "AND"
    }
  }
}'
=> OK, 1 hit. However, the fields property has been omitted from the query.

$ curl -X POST "http://devel1:9200/test/_search" -d '{
  "query": {
    "query_string": {
      "query": "a house",
      "fields": ["_id", "str"],
      "default_operator": "AND"
    }
  }
}'
=> No hits

$ curl -X POST "http://devel1:9200/test/_search" -d '{
  "query": {
    "query_string": {
      "query": "a house",
      "fields": ["str"],
      "default_operator": "AND"
    }
  }
}'
=> OK, 1 hit. I left out the _id field

$ curl -X POST "http://devel1:9200/test/_search" -d '{
  "query": {
    "query_string": {
      "query": "a",
      "fields": ["_id", "str"],
      "default_operator": "AND"
    }
  }
}'
=> No hits

Thanks in advance,

Radek


(Shay Banon) #2

It's problematic because of what effectively happens for the query with no
hits ("a house" on _id and str): an AND boolean query is created on "a"
(dismax on _id only, not on str, because "a" is a stop word and is removed
during analysis) and "house" (dismax on _id and str). So there is no match
on the first part, and because the operator is AND, it won't find anything.

This doesn't happen when you just query on str, because the analysis simply
removes the "a", and there is no query generated against _id (which is not
analyzed).
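
To make the explanation above concrete, here is a toy Python model of it. It is only a sketch of the behaviour described in this thread, not how Elasticsearch or Lucene is implemented: the stop word set, the whitespace tokenizer, and the function names (analyze, term_clause, matches) are all assumptions made for illustration.

```python
# Toy model only: NOT the real Elasticsearch/Lucene implementation.
STOP_WORDS = {"a", "an", "the"}  # assumed subset of the default English stop words

# The single document indexed in the test case.
DOC = {"_id": "0001", "str": "a malibu house"}


def analyze(text):
    """Mimic an analyzed string field: lowercase, split on whitespace, drop stop words."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def indexed_terms(field):
    """Terms stored in the index for a field of DOC."""
    if field == "_id":
        return {DOC[field]}  # not_analyzed: indexed as one exact term
    return set(analyze(DOC[field]))


def term_clause(term, fields):
    """Per-term disjunction: one (field, term) pair for each field that keeps the term."""
    clause = []
    for field in fields:
        if field == "_id":
            clause.append((field, term))  # no analysis, so "a" survives here
        elif analyze(term):
            clause.append((field, analyze(term)[0]))
    return clause


def matches(query, fields):
    """AND together the per-term clauses; each clause is an OR over its fields."""
    clauses = [term_clause(t, fields) for t in query.split()]
    clauses = [c for c in clauses if c]  # a term removed from every field yields no clause
    if not clauses:
        return False  # assumption: a query with no clauses left matches nothing
    return all(
        any(term in indexed_terms(field) for field, term in clause)
        for clause in clauses
    )


print(matches("a house", ["_id", "str"]))  # False: the "a" clause exists only on _id
print(matches("a house", ["str"]))         # True: "a" is dropped, "house" matches
print(matches("a", ["_id", "str"]))        # False: "a" is searched on _id only
```

In this model the stop word "a" still produces a mandatory clause whenever _id is among the fields, which is why adding _id turns a hit into no hits.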

You can build a bool query yourself from two query_string queries, one on
_id and one on str; this will solve it. Note that you store the _id, which
you always get back when you search, so I'm not sure why you need to do that...
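
One possible reading of the bool query suggestion above, sketched in Python so the request body can be built from a variable query string. Combining the two query_string queries as "should" clauses is an assumption (the post only says "between two query_string"), and the helper name build_id_or_text_query is hypothetical. The body has not been run against a live cluster.

```python
import json


def build_id_or_text_query(user_query):
    """Return a search body with two query_string queries, one per field, combined with bool/should."""

    def query_string(field):
        # Same options as in the test case, but restricted to a single field.
        return {
            "query_string": {
                "query": user_query,
                "fields": [field],
                "default_operator": "AND",
            }
        }

    # Assumption: "should" is used so that a match on either field is enough.
    return {"query": {"bool": {"should": [query_string("_id"), query_string("str")]}}}


body = build_id_or_text_query("a house")
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d against http://devel1:9200/test/_search, in the same way as the queries in the first post. Because each field gets its own query_string, the stop word is removed on str without leaving behind a mandatory clause on _id.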

(In reply to Radek's post of Wed, Apr 18, 2012, 5:02 PM.)


(Radek) #3

Thanks for the answer. I will try building the bool query. I store the
_id field to make it indexed; the document is searchable by id as well.
The _id used to be indexed automatically some time ago.

(In reply to Shay Banon's message of 19 Apr, 16:51.)

