Bad response time when using date ranges

Matias_Waisgold (Matías Waisgold) August 28, 2013, 2:01pm 1

I have a collection with 8M documents. I want to retrieve the last 20 days
of documents for a "seller_id", and I'm running this query:

{
  "query": {
    "term": {
      "seller_id": 83013710
    }
  },
  "filter": {
    "range": {
      "date_created": {
        "from": "now-20d"
      }
    }
  },
  "size": 50
}

The problem is that it takes about 200 ms to retrieve only 200 results,
whereas searching by seller_id alone takes 3 ms. How can I improve this
query? Are there any best practices for date fields? (The field is in
dateTime format with the default precision_step.)

Kind regards.
Matías

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

mattweber (Matt Weber) August 28, 2013, 2:42pm 2

Use a filtered query instead of the outer filter, and round your "now" to
the day (now/d-20d) so the filter can be cached and reused. If you don't
need scoring, use a constant score query with a bool filter that has the
term and range filters as must clauses. That should give better
performance.
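
As a rough sketch of the first suggestion, the original request could be rewritten as a filtered query with the rounded date. It reuses the field names and values from the question and the query DSL of the Elasticsearch releases current at the time of this thread (later major versions replaced the filtered query with a bool query that has a filter clause), so treat it as an illustration rather than a tested request:

```json
{
  "query": {
    "filtered": {
      "query": {
        "term": { "seller_id": 83013710 }
      },
      "filter": {
        "range": {
          "date_created": { "from": "now/d-20d" }
        }
      }
    }
  },
  "size": 50
}
```

If scoring is not needed, the second suggestion might look like the following, with both conditions as must clauses of a bool filter inside a constant score query:

```json
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "seller_id": 83013710 } },
            { "range": { "date_created": { "from": "now/d-20d" } } }
          ]
        }
      }
    }
  },
  "size": 50
}
```

Note that rounding to the day moves the lower bound back to midnight, so the window covers slightly more than exactly 20 x 24 hours.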

--
Thanks,
Matt Weber

Matias_Waisgold (Matías Waisgold) August 28, 2013, 3:38pm 3

Wow, that's awesome. Just to understand it better, why does this rounding
improve performance?

mattweber (Matt Weber) August 28, 2013, 3:50pm 4

Without rounding, your "now" includes milliseconds, so the filter can never
be reused from the cache (the milliseconds differ on every request). When
you round, the filter has a much better chance of being reused, which
results in huge performance improvements.

Thanks,
Matt Weber

Matias_Waisgold (Matías Waisgold) August 29, 2013, 3:34am 5

Great, thanks!

jprante (Jörg Prante) August 29, 2013, 5:26am 6

Just to add, a numeric range filter on a simple int or long field is very
fast.

http://www.elasticsearch.org/guide/reference/query-dsl/numeric-range-filter/

In many cases, when doing date ranges, you never need milliseconds (or even
seconds), so you should discretize the timestamps to coarser int values
and change the field type accordingly.
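
A minimal sketch of this idea, under stated assumptions: the documents carry an extra integer field, here given the hypothetical name "day_created", that stores the creation date as whole days since the Unix epoch, and the client computes the lower bound itself because date math such as "now" does not apply to a plain integer field. The value 15925 corresponds to 2013-08-08, that is, 20 days before the date of this thread:

```json
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "seller_id": 83013710 } },
            { "numeric_range": { "day_created": { "from": 15925 } } }
          ]
        }
      }
    }
  },
  "size": 50
}
```

Because the bound only changes once a day, identical requests issued during the same day send exactly the same filter.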

Jörg
