On Wednesday, August 28, 2013, Matías Waisgold wrote:
I have a collection with 8M documents. I want to retrieve the last 20 days
of documents for a given "seller_id", and I'm running this query:
The problem is that it takes about 200ms to retrieve only 200 results,
whereas searching by seller_id alone takes 3ms. How can I improve this
query? Are there any best practices for date fields? (The field is in
dateTime format with the default precision_step.)
Kind regards.
Matías

Use a filtered query instead of the outer (top-level) filter, and round
your "now" to the day (now/d-20d) so the filter can be cached and reused.
If you don't need scoring, use a constant score query with a bool filter
that has the term and range filters as must clauses. That should give
better performance.
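
For illustration, here is a rough sketch of the two request bodies that
suggestion describes, built as Python dicts and printed as JSON. It assumes
the query DSL of the Elasticsearch releases current in 2013 (the "filtered"
query was later deprecated in 2.0 and removed in 5.0). The date field name
"created_at", the seller id 42 and the helper name build_query are
placeholders that do not come from the thread.

```python
import json


def build_query(seller_id, date_field="created_at", days=20, scored=False):
    """Build a search body with a term + range bool filter.

    date_field and the default of 20 days are illustrative assumptions.
    """
    bool_filter = {
        "bool": {
            "must": [
                {"term": {"seller_id": seller_id}},
                # "now" is rounded to the day, so the filter text stays
                # identical for every request made on the same day.
                {"range": {date_field: {"gte": "now/d-%dd" % days}}},
            ]
        }
    }
    if scored:
        # Filtered query: scoring comes from the inner query.
        return {
            "query": {
                "filtered": {
                    "query": {"match_all": {}},
                    "filter": bool_filter,
                }
            }
        }
    # No scoring needed: wrap the filter in a constant score query.
    return {"query": {"constant_score": {"filter": bool_filter}}}


if __name__ == "__main__":
    print(json.dumps(build_query(42), indent=2))
```

Passing scored=True yields the filtered-query form; the default yields the
constant score form for the case where relevance scores are not needed.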
Without rounding, your "now" includes milliseconds, so the filter can never
be reused from the cache (the milliseconds differ on every request). When
you round, the filter has a much better chance of being reused, which can
result in large performance improvements.
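
A small sketch of the effect described above, done in plain Python rather
than inside Elasticsearch: two requests a fraction of a second apart produce
different lower bounds when "now" is left unrounded, but the same lower bound
once it is truncated to the day. Treating the bound as the cache key is a
simplification of how the filter cache works, and the function name
lower_bound is made up for this example.

```python
from datetime import datetime, timedelta


def lower_bound(now, days=20, round_to_day=True):
    """Return the range's lower bound as an ISO string."""
    if round_to_day:
        # Equivalent in spirit to now/d: drop everything below the day.
        now = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return (now - timedelta(days=days)).isoformat()


# Two requests arriving within the same second.
first = datetime(2013, 8, 28, 10, 15, 30, 123000)
second = datetime(2013, 8, 28, 10, 15, 30, 987000)

# Unrounded bounds differ, so a cached filter could not be reused.
print(lower_bound(first, round_to_day=False) == lower_bound(second, round_to_day=False))  # False

# Rounded bounds are identical for the whole day.
print(lower_bound(first) == lower_bound(second))  # True
```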
In many cases, when doing date ranges, you never need milliseconds (or even
seconds), so you should discretize the timestamps to coarser integer values
and change the field type accordingly.
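
As a hedged example of that kind of discretization, the sketch below
converts a timestamp to whole minutes or whole days since the Unix epoch at
index time. The resulting integers would then be indexed in an integer or
long field instead of a date field. The function names and the choice of
minute and day granularity are assumptions for illustration, not something
the thread specifies.

```python
from datetime import datetime, timezone


def to_epoch_minutes(dt):
    """Whole minutes since 1970-01-01 UTC for a timezone-aware datetime."""
    return int(dt.timestamp() // 60)


def to_epoch_days(dt):
    """Whole days since 1970-01-01 UTC for a timezone-aware datetime."""
    return int(dt.timestamp() // 86400)


if __name__ == "__main__":
    stamp = datetime(2013, 8, 28, 10, 15, 30, 123000, tzinfo=timezone.utc)
    # Values that differ only in seconds or milliseconds collapse to the
    # same integer, so the index holds far fewer distinct terms.
    print(to_epoch_minutes(stamp), to_epoch_days(stamp))
```

A range over the last 20 days then becomes a plain integer comparison, for
example day >= to_epoch_days(now) - 20.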