Prefix ending in dot


(George Sakkis) #1

Hi all,

It looks like searching for a prefix that ends in a dot does not find
documents where a term exactly matches the prefix. Here's an example:

curl -XDELETE 'http://localhost:9200/test'
curl -XPUT 'http://localhost:9200/test/'

curl -XPUT 'http://localhost:9200/test/country/1' -d '{"name": "US of A"}'
curl -XPUT 'http://localhost:9200/test/country/2' -d '{"name": "USA"}'

curl -XPUT 'http://localhost:9200/test/country/3' -d '{"name": "U.S of A"}'
curl -XPUT 'http://localhost:9200/test/country/4' -d '{"name": "U.S. of A"}'
curl -XPUT 'http://localhost:9200/test/country/5' -d '{"name": "U.S.A"}'

curl -XPOST 'http://localhost:9200/test/_refresh'

finds (1) and (2): ok

curl -s 'http://localhost:9200/test/country/_search?pretty=true' -d '{"query": {"prefix": {"name": "us"}}}'

finds (3), (4) and (5): ok

curl -s 'http://localhost:9200/test/country/_search?pretty=true' -d '{"query": {"prefix": {"name": "u.s"}}}'

finds (5) but not (4): not ok

curl -s 'http://localhost:9200/test/country/_search?pretty=true' -d '{"query": {"prefix": {"name": "u.s."}}}'

Is this expected behavior or a bug?

Thanks,
George


(Shay Banon) #2

The prefix query is not analyzed, and it expects to find the prefix in a
single term. Use the analyze API to see how the text is broken down. You can
use the text query in prefix mode for this.
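
For anyone following along, here is a rough sketch of both suggestions, assuming the same local node and "test" index as in the first post. The helper names (analyze, text_prefix_body, search_text_prefix) are made up for illustration, and the request shapes follow the analyze API and text query as they existed around this release; newer versions use "match" in place of "text", so treat the exact syntax as an assumption to check against your version.

```shell
# Assumed local node, same as in the original example.
ES='http://localhost:9200'

# Build the JSON body for a text query in phrase_prefix mode.
# Unlike the prefix query, the query text here goes through analysis.
# Note: the argument is inserted as-is, so it must not contain double quotes.
text_prefix_body() {
  printf '{"query": {"text": {"name": {"query": "%s", "type": "phrase_prefix"}}}}' "$1"
}

# Ask the node how the standard analyzer breaks a string into terms.
analyze() {
  curl -s "$ES/test/_analyze?analyzer=standard&pretty=true" -d "$1"
}

# Run the analyzed prefix search against the "country" type.
search_text_prefix() {
  curl -s "$ES/test/country/_search?pretty=true" -d "$(text_prefix_body "$1")"
}

# Usage against a running node:
#   analyze 'U.S. of A'
#   search_text_prefix 'u.s.'
```

Comparing the terms printed by analyze with the string passed to the prefix query should show whether the prefix can match any single indexed term.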

On Mon, Jan 9, 2012 at 4:20 PM, George Sakkis george.sakkis@gmail.com wrote:

Is this expected behavior or a bug?


(George Sakkis) #3

Ah, got it now: the standard analyzer strips off the trailing dot from
"U.S.", which causes the prefix search for "u.s." to fail.

Many thanks!

On Jan 9, 8:52 pm, Shay Banon kim...@gmail.com wrote:

prefix query is not analyzed and it expects to find the prefix in a single
term, use the analyze API to see how the text is broken down. You can use
the text query in prefix mode for this.

