I am evaluating to migrate our old search system using lucene to elastic
search and I want to do a query/filter on a value of my document that
depends on current time and multiple fields.
Let me describe by example:
Document: Product
Fields:
list_price / Double
sale_price / nullable Double
sale_price_start / nullable Date
sale_price_end / nullable Date
The logic/calculation of the "value" that I want to do query/filter is:
sale_price != null && (sale_price_start <= NOW <= sale_price_end) ?
sale_price : list_price
Because "NOW" is changing, we cannot index the value as a stored field.
But I know we can do query / filter using script field to dynamically
calculate the value. However the performance should be bad because it needs
to calculate the value for all documents in the index.
More background:
in our old days, we just index the calculated value and have another daily
cron to find and reindex all documents which have the calculated values
"expired" (because the sale_price_start and sale_price_end already expires)
so now I just want to see if there is any advice to model this problem in
elasticsearch.
Your previous solution is still the best and perfectly applicable.
Sent from my iPhone
On May 29, 2012, at 7:27 PM, Yiu Wing TSANG ywtsang@gmail.com wrote:
Hi,
I am evaluating to migrate our old search system using lucene to Elasticsearch and I want to do a query/filter on a value of my document that depends on current time and multiple fields.
Let me describe by example:
Document: Product
Fields:
list_price / Double
sale_price / nullable Double
sale_price_start / nullable Date
sale_price_end / nullable Date
The logic/calculation of the "value" that I want to do query/filter is:
sale_price != null && (sale_price_start <= NOW <= sale_price_end) ? sale_price : list_price
Because "NOW" is changing, we cannot index the value as a stored field.
But I know we can do query / filter using script field to dynamically calculate the value. However the performance should be bad because it needs to calculate the value for all documents in the index.
More background:
in our old days, we just index the calculated value and have another daily cron to find and reindex all documents which have the calculated values "expired" (because the sale_price_start and sale_price_end already expires)
so now I just want to see if there is any advice to model this problem in elasticsearch.
If you have some search criteria based on price: filter(price), you can
formulate you search query like this:
(
(sale_price != null AND sale_price_start <= NOW AND NOW <=
sale_price_end) AND filter(sale_price)
)
OR
(
(sale_price == null OR sale_price_start > NOW OR sale_price_end < NOW)
AND filter(list_price)
)
That's not valid elasticsearch synax, but, hopefully, it demonstrates the
idea.
On Tuesday, May 29, 2012 11:47:59 PM UTC-4, RKM wrote:
Your previous solution is still the best and perfectly applicable.
Sent from my iPhone
On May 29, 2012, at 7:27 PM, Yiu Wing TSANG ywtsang@gmail.com wrote:
Hi,
I am evaluating to migrate our old search system using lucene to elastic
search and I want to do a query/filter on a value of my document that
depends on current time and multiple fields.
Let me describe by example:
Document: Product
Fields:
list_price / Double
sale_price / nullable Double
sale_price_start / nullable Date
sale_price_end / nullable Date
The logic/calculation of the "value" that I want to do query/filter is:
sale_price != null && (sale_price_start <= NOW <= sale_price_end) ?
sale_price : list_price
Because "NOW" is changing, we cannot index the value as a stored field.
But I know we can do query / filter using script field to dynamically
calculate the value. However the performance should be bad because it needs
to calculate the value for all documents in the index.
More background:
in our old days, we just index the calculated value and have another
daily cron to find and reindex all documents which have the calculated
values "expired" (because the sale_price_start and sale_price_end already
expires)
so now I just want to see if there is any advice to model this problem
in elasticsearch.
To better illustrate my idea, I am trying hard to build the "price"
range filters and sorting similar to endless.com:
Now, I am able to build the query/filter based on multiple
combinations of OR/AND/MISSING/EXIST filters as hinted by Igor Motov.
But when I go to build the range facet and sorting on the "calculated"
price, I am stuck.
So I am back to think about how to effectively reindex the products
when sale period expires as suggested by Randy, and I find that the
update API can quickly mass update many records using scripts
conveniently and I think I will go in this way. And if there is any
better way, free feel to share, many thanks.
Wing
On Fri, Jun 1, 2012 at 1:49 AM, Igor Motov imotov@gmail.com wrote:
If you have some search criteria based on price: filter(price), you can
formulate you search query like this:
(
(sale_price != null AND sale_price_start <= NOW AND NOW <=
sale_price_end) AND filter(sale_price)
)
OR
(
(sale_price == null OR sale_price_start > NOW OR sale_price_end < NOW) AND
filter(list_price)
)
That's not valid elasticsearch synax, but, hopefully, it demonstrates the
idea.
On Tuesday, May 29, 2012 11:47:59 PM UTC-4, RKM wrote:
Your previous solution is still the best and perfectly applicable.
Sent from my iPhone
On May 29, 2012, at 7:27 PM, Yiu Wing TSANG ywtsang@gmail.com wrote:
Hi,
I am evaluating to migrate our old search system using lucene to elastic
search and I want to do a query/filter on a value of my document that
depends on current time and multiple fields.
Let me describe by example:
Document: Product
Fields:
list_price / Double
sale_price / nullable Double
sale_price_start / nullable Date
sale_price_end / nullable Date
The logic/calculation of the "value" that I want to do query/filter is:
sale_price != null && (sale_price_start <= NOW <= sale_price_end) ?
sale_price : list_price
Because "NOW" is changing, we cannot index the value as a stored field.
But I know we can do query / filter using script field to dynamically
calculate the value. However the performance should be bad because it needs
to calculate the value for all documents in the index.
More background:
in our old days, we just index the calculated value and have another
daily cron to find and reindex all documents which have the calculated
values "expired" (because the sale_price_start and sale_price_end already
expires)
so now I just want to see if there is any advice to model this problem
in elasticsearch.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.