Range filter slower than no range query (full scan)


(zuhaib-2) #1

Hey,

So we have a query like the following:

{
  "sort": {
    "date.untouched": "desc"
  },
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "_cache": false,
                "date.untouched": {
                  "gte": "2012-10-16T18:33:42Z",
                  "lte": "2013-11-10T05:31:15Z"
                }
              }
            },
            [
              {
                "term": {
                  "_cache": false,
                  "foo.id": 1
                }
              },
              {
                "term": {
                  "bar.id": 1,
                  "_cache": false
                }
              }
            ]
          ]
        }
      }
    }
  },
  "size": 75
}

and we are seeing responses back that take something like this:

{
  "took" : 1334,
  "timed_out" : false,
  "_shards" : {
    "total" : 8,
    "successful" : 8,
    "failed" : 0
  },
  "hits" : {
    "total" : 8138,
    "max_score" : null,
    "hits" : [ {
    ...

Now if I drop the range query:

{
  "sort": {
    "date.untouched": "desc"
  },
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            [
              {
                "term": {
                  "_cache": false,
                  "foo.id": 1
                }
              },
              {
                "term": {
                  "bar.id": 1,
                  "_cache": false
                }
              }
            ]
          ]
        }
      }
    }
  },
  "size": 75
}

It returns results super fast:

{
  "took" : 53,
  "timed_out" : false,
  "_shards" : {
    "total" : 8,
    "successful" : 8,
    "failed" : 0
  },
  "hits" : {
    "total" : 7257,
    "max_score" : null,
    "hits" : [ {
    ...

Are we doing something wrong with the range filter, or is this just the
expected performance? A little info on our setup: this is against a single
index (for the testing I used a different index each time to avoid hitting
cached results), and we have the filter cache disabled because of the heap
memory it needs. We have enabled the filter cache in the past (for small
tests) and it seemed to make little difference. We split the index by time
(index-2013.11, index-2013.10, etc.), and if the range is small (say, just
a day) it's also pretty quick. But if the range spans a large date gap, it
seems to slow everything down.

Any idea?

Thanks
Zuhaib

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Adrien Grand) #2

For large gaps, the numeric range filter [1] usually performs better than
the range filter. Please note, however, that it works on top of field data,
so Elasticsearch will need to load all your date values into memory.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-numeric-range-filter.html
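
For illustration, here is a minimal sketch of the original query with the
range filter swapped for a numeric range filter. It assumes the filtered
field is mapped as a date (or other numeric) type. The field name `date`
below is a hypothetical stand-in and is not taken from the mapping in this
thread.

```json
{
  "sort": {
    "date.untouched": "desc"
  },
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "numeric_range": {
                "date": {
                  "gte": "2012-10-16T18:33:42Z",
                  "lte": "2013-11-10T05:31:15Z"
                }
              }
            },
            { "term": { "foo.id": 1 } },
            { "term": { "bar.id": 1 } }
          ]
        }
      }
    }
  },
  "size": 75
}
```

Because field data is loaded on first use, the first request against an
index would likely be slower than later ones.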

--
Adrien Grand



(zuhaib-2) #3

Thanks for the suggestion, Adrien, but there are a few issues for us. First,
the range field for us is a string (it's a date, but we store it as a
string), so we can't use the current numeric range filter. According to that
page, they are removing numeric range and adding an execution option so that
you can use fielddata. I wonder if that will work with string ranges? I need
to check out 0.90.8 to confirm, but it looks like it hasn't been released yet.

Thanks again
Zuhaib



(zuhaib-2) #4

Just an update: it seems the change in 0.90.8 is just a renaming of the
numeric_range filter to fielddata, and it still has the same constraint (the
field must be numeric). I opened a bug ticket requesting that the fielddata
range filter work on non-numeric types:
https://github.com/elasticsearch/elasticsearch/issues/4318
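
For reference, here is a sketch of a range filter using the fielddata
execution mode. The placement of `execution` next to the field name is an
assumption based on the reference docs and should be checked against the
release in use. The field name `date` is a hypothetical date-mapped field,
since a string field such as `date.untouched` runs into the numeric-only
constraint.

```json
{
  "range": {
    "execution": "fielddata",
    "date": {
      "gte": "2012-10-16T18:33:42Z",
      "lte": "2013-11-10T05:31:15Z"
    }
  }
}
```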


