Question about range query


(caphrim007) #1

Hi,

I'm using ES 0.17.7 and making GET requests via the "_search?q=" URI
just for testing purposes before I turn the query into a JSON string.

I'm trying to do a range query. Here it is

/_search?q=sourceSize:[990+TO+991]

And it is returning documents with the following sourceSize fields and
vals.

"sourceSize": "9906",
"sourceSize": "9909112",
"sourceSize": "990"

I interpret the range query to be values literally from 990 to 991,
which should then only return the "990" document above. But it appears
that ES is interpreting the query to be something like [990*+TO+991*].

Can I use the _search endpoint to have it do a literal value search
instead of what appears to be a wildcard search?

Thanks in advance


(Karussell) #2

Why not use a range query?

http://www.elasticsearch.org/guide/reference/query-dsl/range-query.html

The usual approach is to send a POST request with a JSON query in the body.
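
For illustration, here is a minimal sketch of what such a JSON body could look like, built in Python. The helper name build_range_query is made up for this example, and the "from" / "to" / "include_lower" / "include_upper" keys are the ones I recall from the range query reference linked above; check them against the ES version you run before relying on them.

```python
import json

def build_range_query(field, lower, upper):
    """Build a search body with an inclusive range clause on `field`.

    Key names are an assumption based on the range query reference page;
    verify them against your Elasticsearch version.
    """
    return {
        "query": {
            "range": {
                field: {
                    "from": lower,
                    "to": upper,
                    "include_lower": True,
                    "include_upper": True,
                }
            }
        }
    }

# Numeric bounds only behave numerically if the field is mapped as a number.
body = build_range_query("sourceSize", 990, 991)
print(json.dumps(body, indent=2))
```

The printed JSON is what would go in the POST body of a _search request.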

Also do you have multiple identical properties per document or a
mapping configuration?

Peter.


(caphrim007) #3

I am using a range query. I already answered why I'm not using JSON
in the question I asked. In case you missed it.

"""
...and making GET requests via the "_search?q=" URI
just for testing purposes before I turn the query into a JSON string.
"""

My documents have fields. Not sure what you mean by identical, because
a document cannot have more than one field with the same name, but if
you mean do I have documents that have identical field names, then
yes, some documents have the same field names as other documents. My
example illustrates that; the "sourceSize" field.

The mapping configuration is the default supplied with ES.

-Tim


(Karussell) #4

The questions were only asked because this is the normal way of
helping to track it down ...

Give us the docs you are indexing and the query (i.e. a curl recreation)
that you expect to work. I suspect you are indexing the values as strings:

"sourceSize": "9906"

and not as numbers:

"sourceSize": 9906
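
To make the difference concrete, here is a small Python sketch showing how the two JSON forms parse, plus one way to convert a value before indexing. The helper to_numeric and the type name "doc" in the mapping are hypothetical, and the mapping body is only my assumption of what an explicit numeric mapping would look like; treat it as a starting point rather than a verified recipe.

```python
import json

# The same value as JSON text: quoted (a string) versus bare (a number).
as_string = json.loads('{"sourceSize": "9906"}')
as_number = json.loads('{"sourceSize": 9906}')

def to_numeric(doc, field):
    """Return a copy of doc with doc[field] converted to int."""
    fixed = dict(doc)
    fixed[field] = int(fixed[field])
    return fixed

# Hypothetical explicit mapping; "doc" is a made-up type name.
mapping = {"doc": {"properties": {"sourceSize": {"type": "long"}}}}

print(type(as_string["sourceSize"]).__name__)           # str
print(type(as_number["sourceSize"]).__name__)           # int
print(json.dumps(to_numeric(as_string, "sourceSize")))  # {"sourceSize": 9906}
```

With default (dynamic) mapping, the quoted form is what leads to the field being treated as a string.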

Peter.


(caphrim007) #5

yep, I'm sure I'm indexing them as strings; the _mapping endpoint confirms
that.

How do range queries work with strings? Is the answer "the behavior is
undefined"?

Is ES looking for anything that begins with the "from" value, in this case
anything that begins with 990, and also for anything that begins with 991?

My documents are kinda big, see attachment for 1 of them. The rest of the
couple hundred thousand are permutations of that. If you want to test it on
small scale, just create a couple docs with 1 field, a numeric that is
stored as a string, and try to do a range query on it (using my example).

"sourceSize": "9906",
"sourceSize": "9909112",
"sourceSize": "990"

Is what you would get. But I don't "get" how the 1st or 2nd entries qualify
as a match, because even as strings they are longer than my range. So I
don't understand why they match. I might understand if ES returned

"sourceSize": "992"
"sourceSize": "993"
"sourceSize": "994"

Because in that case, the strings are at least the same length as the
values in the range query (990 and 991), even if the returned matches are
"incorrect" from a numeric point of view.

How do range queries for string values work in ES?

-Tim


(Shay Banon) #6

String-based range queries are lexicographic. If you want numeric ranges,
you can map the field as a number (or index the values as numbers).
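
As a rough illustration of why "9906" and "9909112" fall inside [990 TO 991] when compared as strings, here is a small Python sketch. It only models the ordering, using Python's own character-by-character string comparison, and assumes each value is indexed as a single term; the function names are made up for this example.

```python
def in_string_range(value, lower, upper):
    """Inclusive range check using lexicographic (character-by-character) order."""
    return lower <= value <= upper

def in_numeric_range(value, lower, upper):
    """Inclusive range check after converting everything to integers."""
    return int(lower) <= int(value) <= int(upper)

values = ["9906", "9909112", "990", "992"]

string_hits = [v for v in values if in_string_range(v, "990", "991")]
numeric_hits = [v for v in values if in_numeric_range(v, "990", "991")]

print(string_hits)   # ['9906', '9909112', '990']
print(numeric_hits)  # ['990']
```

"9906" sorts after "990" because "990" is a prefix of it, and before "991" because the third character "0" is less than "1". Length plays no part in the comparison, which is also why "992" does not match.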

