The difference between range query and range filter?

mp2893 · May 2, 2012, 4:39pm

Hi,

I am currently making a lot of queries based on text and dates.

I use a bool query that includes a text query and a range query.
The text query and the range query are combined with a "must".
For example,
{
  "query": {
    "bool": {
      "must": [
        {"text": {"title": "my sample query"}},
        {"range": {"date": {"gte": "2012-05-01", "lte": "2012-05-05"}}}
      ]
    }
  }
}

But I found out that the same goal could be achieved with a filtered
query that has a range filter.
For example,
{
  "query": {
    "filtered": {
      "filter": {
        "range": {"date": {"gte": "2012-05-01", "lte": "2012-05-05"}}
      },
      "query": {
        "text": {"title": "my sample query"}
      }
    }
  }
}

Which one is the better way to go?
I'd appreciate any kind of advice.
Thanks.

Ed

kimchy · May 2, 2012, 4:52pm

Usually, the filtered option is better. A filter caches its results (in an
optimized manner), so if another range filter arrives with the same range,
the data will already be available from the cache. The cache is called the
filter cache; it is LRU-based with automatic memory management (by default,
20% of the heap). The Node stats and Indices stats APIs can report its
utilization.
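
To make the caching point concrete, here is a toy sketch in Python. It is not how Elasticsearch implements its filter cache; the class name `ToyFilterCache`, its methods, and the sample documents are all made up for illustration. It only shows the general idea: an LRU cache keyed by the exact range bounds gets a hit when the same range repeats and a miss when the range changes.

```python
from collections import OrderedDict


class ToyFilterCache:
    """Toy LRU cache keyed by (field, gte, lte). Hypothetical, not Elasticsearch code."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def matching_docs(self, field, gte, lte, docs):
        key = (field, gte, lte)
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        self.misses += 1
        # Cache miss: scan the documents and remember which positions match.
        # ISO-8601 date strings compare correctly as plain strings.
        result = frozenset(i for i, d in enumerate(docs) if gte <= d[field] <= lte)
        self.entries[key] = result
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return result


docs = [{"date": "2012-04-30"}, {"date": "2012-05-02"}, {"date": "2012-05-04"}]
cache = ToyFilterCache(max_entries=2)
first = cache.matching_docs("date", "2012-05-01", "2012-05-05", docs)
second = cache.matching_docs("date", "2012-05-01", "2012-05-05", docs)  # same range: hit
third = cache.matching_docs("date", "2012-04-01", "2012-04-30", docs)   # new range: miss
print(sorted(first), sorted(third), cache.hits, cache.misses)
# prints: [1, 2] [0] 1 2
```

In this toy model, a workload made only of ranges that never repeat would never hit the cache, so the cache by itself would bring no benefit there.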

mp2893 · May 3, 2012, 1:41am

Thanks, Shay, for the detailed answer.
So, if I understood correctly, the only difference between the two types
of query is whether the results are cached or not?
Then, if consecutive queries use totally non-overlapping date ranges,
will the two types of query give me the same performance?

Regards,
Ed

Ivan · May 3, 2012, 6:59pm

Filters also do not contribute to the scoring of a document, whereas
additional clauses to a query will.
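
A small Python sketch of the two request bodies from the original post may help show where the range clause sits in each case. The helper names (`range_clause`, `bool_must_body`, `filtered_body`) are hypothetical; the field names and query syntax are copied from the examples in this thread.

```python
import json


def range_clause(gte, lte):
    # The same range clause is used in both request bodies.
    return {"range": {"date": {"gte": gte, "lte": lte}}}


def bool_must_body(text, gte, lte):
    # Range as a query clause inside bool/must: it takes part in scoring.
    return {"query": {"bool": {"must": [
        {"text": {"title": text}},
        range_clause(gte, lte),
    ]}}}


def filtered_body(text, gte, lte):
    # Range as a filter: it only includes or excludes documents,
    # and the score comes from the text query alone.
    return {"query": {"filtered": {
        "filter": range_clause(gte, lte),
        "query": {"text": {"title": text}},
    }}}


scored = bool_must_body("my sample query", "2012-05-01", "2012-05-05")
unscored = filtered_body("my sample query", "2012-05-01", "2012-05-05")
print(json.dumps(unscored, indent=2))
```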

--
Ivan

mp2893 · May 5, 2012, 6:16am

So for strict filtering, I should use filters rather than combined
queries.
Thanks for the info, Ivan.
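
For readers on newer Elasticsearch releases: the `filtered` query was later deprecated and then removed, and the `text` query was renamed `match`. The usual replacement is a `bool` query with a `filter` clause. The sketch below builds such a body in Python; the helper name `bool_filter_body` is hypothetical, and the exact syntax should be checked against the documentation for the version in use.

```python
import json


def bool_filter_body(text, gte, lte):
    # "must" clauses are scored; "filter" clauses only include or exclude documents.
    return {"query": {"bool": {
        "must": [{"match": {"title": text}}],
        "filter": [{"range": {"date": {"gte": gte, "lte": lte}}}],
    }}}


body = bool_filter_body("my sample query", "2012-05-01", "2012-05-05")
print(json.dumps(body, indent=2))
```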

Best,
Ed
