Bad response time when using date ranges

I have a collection with 8M documents. I want to retrieve the last 20 days
of documents for a "seller_id". I'm running this query:

{
  "query": {
    "term": {
      "seller_id": 83013710
    }
  },
  "filter": {
    "range": {
      "date_created": {
        "from": "now-20d"
      }
    }
  },
  "size": 50
}

The problem is that it takes about 200ms to retrieve only 200 results,
whereas searching only by seller_id takes 3ms. How can I improve this
query? Are there any best practices for date fields? (The field is in
dateTime format with the default precision_step.)

Kind regards.
Matías

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Use a filtered query instead of the outer filter, and round your "now" to
the day (now/d-20d) so the filter can be cached and reused. If you don't
need scoring, use a constant score query with a bool filter that has the
term and range filters as must clauses. That should give better
performance.
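
For anyone who wants to see both suggestions written out, here is a minimal
Python sketch that builds the two request bodies as plain dicts. The helper
names (filtered_body, constant_score_body) and their parameters are made up
for illustration; the field names and values come from the original query.
The DSL shape is what I would expect for the Elasticsearch versions current
at the time of this thread (the filtered query was deprecated in 2.0 and
removed in 5.0), so treat it as an assumption and check it against your
version.

```python
import json


def range_filter(days):
    # Lower bound rounded to the start of the day so the filter is cacheable.
    return {"range": {"date_created": {"from": f"now/d-{days}d"}}}


def filtered_body(seller_id, days=20, size=50):
    """Term query wrapped in a filtered query (hypothetical helper)."""
    return {
        "query": {
            "filtered": {
                "query": {"term": {"seller_id": seller_id}},
                "filter": range_filter(days),
            }
        },
        "size": size,
    }


def constant_score_body(seller_id, days=20, size=50):
    """No scoring: term and range are both must clauses of a bool filter."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"seller_id": seller_id}},
                            range_filter(days),
                        ]
                    }
                }
            }
        },
        "size": size,
    }
```

Calling json.dumps(constant_score_body(83013710), indent=2) gives the JSON
body to send to the search endpoint.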

--
Thanks,
Matt Weber

WOW, that's awesome. Just to understand it better, why does this rounding
improve performance?

Without rounding, your "now" includes milliseconds, so the filter can never
be reused from the cache (since the milliseconds have changed). When you
round, the filter has a better chance of being reused, resulting in huge
performance improvements.
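
To make that concrete, here is a toy Python model of the effect. It is not
how Elasticsearch implements its filter cache; it only assumes that a cached
filter is looked up by the resolved lower bound of the range. The function
name lower_bound and the two request times are invented for the example.

```python
from datetime import datetime, timedelta, timezone


def lower_bound(now, days, round_to_day):
    """Resolve 'now-<days>d' (or 'now/d-<days>d' when rounding) to a timestamp."""
    if round_to_day:
        # Equivalent of the /d rounding: truncate to the start of the day.
        now = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return now - timedelta(days=days)


# Two requests arriving 250 ms apart on the same day.
first = datetime(2013, 8, 28, 15, 38, 0, 123000, tzinfo=timezone.utc)
second = first + timedelta(milliseconds=250)

# Distinct lower bounds stand in for distinct cache entries.
unrounded_keys = {lower_bound(t, 20, False) for t in (first, second)}
rounded_keys = {lower_bound(t, 20, True) for t in (first, second)}

print(len(unrounded_keys))  # every request produces a new key
print(len(rounded_keys))    # all requests on the same day share one key
```

With rounding, the key only changes once per day, so every request in
between can reuse the same cached filter.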

Thanks,
Matt Weber

Great, thanks!

Just to add, a numeric range filter on a simple int or long field is very
fast.

http://www.elasticsearch.org/guide/reference/query-dsl/numeric-range-filter/

In many cases, when doing date ranges, you never need milliseconds (or even
seconds), so you should discretize the timestamps to coarser int values
and change the field type accordingly.
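
As a rough sketch of what that could look like, the Python below converts a
timestamp to whole days since the Unix epoch at index time and builds a
range filter on that integer. Epoch days are only one possible
discretization, and the field name created_day and both helper names are
hypothetical; nothing here is taken from the linked page.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86_400


def to_epoch_day(ts):
    """Whole days since 1970-01-01 UTC for a timezone-aware datetime."""
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(ts.astimezone(timezone.utc).timestamp()) // SECONDS_PER_DAY


def last_days_filter(now, days):
    """Range filter on the hypothetical integer field 'created_day'."""
    return {"range": {"created_day": {"gte": to_epoch_day(now) - days}}}


# At index time, store to_epoch_day(date_created) in an integer field.
now = datetime(2013, 8, 28, 15, 38, tzinfo=timezone.utc)
print(last_days_filter(now, 20))
```

Because the bound is a day number, it is the same for every request made on
a given day, which also keeps the filter cache-friendly.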

Jörg
