Current time dependent filter/query


(Wing) #1

Hi,

I am evaluating to migrate our old search system using lucene to elastic
search and I want to do a query/filter on a value of my document that
depends on current time and multiple fields.

Let me describe by example:

Document: Product

Fields:
list_price / Double
sale_price / nullable Double
sale_price_start / nullable Date
sale_price_end / nullable Date

The logic/calculation of the "value" that I want to do query/filter is:
sale_price != null && (sale_price_start <= NOW <= sale_price_end) ?
sale_price : list_price

Because "NOW" is changing, we cannot index the value as a stored field.

But I know we can do query / filter using script field to dynamically
calculate the value. However the performance should be bad because it needs
to calculate the value for all documents in the index.

More background:
in our old days, we just index the calculated value and have another daily
cron to find and reindex all documents which have the calculated values
"expired" (because the sale_price_start and sale_price_end already expires)

so now I just want to see if there is any advice to model this problem in
elasticsearch.

Thanks,
Wing


(Randall McRee) #2

Your previous solution is still the best and perfectly applicable.

Sent from my iPhone

On May 29, 2012, at 7:27 PM, Yiu Wing TSANG ywtsang@gmail.com wrote:

Hi,

I am evaluating to migrate our old search system using lucene to elastic search and I want to do a query/filter on a value of my document that depends on current time and multiple fields.

Let me describe by example:

Document: Product

Fields:
list_price / Double
sale_price / nullable Double
sale_price_start / nullable Date
sale_price_end / nullable Date

The logic/calculation of the "value" that I want to do query/filter is:
sale_price != null && (sale_price_start <= NOW <= sale_price_end) ? sale_price : list_price

Because "NOW" is changing, we cannot index the value as a stored field.

But I know we can do query / filter using script field to dynamically calculate the value. However the performance should be bad because it needs to calculate the value for all documents in the index.

More background:
in our old days, we just index the calculated value and have another daily cron to find and reindex all documents which have the calculated values "expired" (because the sale_price_start and sale_price_end already expires)

so now I just want to see if there is any advice to model this problem in elasticsearch.

Thanks,
Wing


(Igor Motov) #3

If you have some search criteria based on price: filter(price), you can
formulate you search query like this:

(
(sale_price != null AND sale_price_start <= NOW AND NOW <=
sale_price_end) AND filter(sale_price)
)
OR
(
(sale_price == null OR sale_price_start > NOW OR sale_price_end < NOW)
AND filter(list_price)
)

That's not valid elasticsearch synax, but, hopefully, it demonstrates the
idea.

On Tuesday, May 29, 2012 11:47:59 PM UTC-4, RKM wrote:

Your previous solution is still the best and perfectly applicable.

Sent from my iPhone

On May 29, 2012, at 7:27 PM, Yiu Wing TSANG ywtsang@gmail.com wrote:

Hi,

I am evaluating to migrate our old search system using lucene to elastic
search and I want to do a query/filter on a value of my document that
depends on current time and multiple fields.

Let me describe by example:

Document: Product

Fields:
list_price / Double
sale_price / nullable Double
sale_price_start / nullable Date
sale_price_end / nullable Date

The logic/calculation of the "value" that I want to do query/filter is:
sale_price != null && (sale_price_start <= NOW <= sale_price_end) ?
sale_price : list_price

Because "NOW" is changing, we cannot index the value as a stored field.

But I know we can do query / filter using script field to dynamically
calculate the value. However the performance should be bad because it needs
to calculate the value for all documents in the index.

More background:
in our old days, we just index the calculated value and have another
daily cron to find and reindex all documents which have the calculated
values "expired" (because the sale_price_start and sale_price_end already
expires)

so now I just want to see if there is any advice to model this problem
in elasticsearch.

Thanks,
Wing


(Wing) #4

To better illustrate my idea, I am trying hard to build the "price"
range filters and sorting similar to endless.com:

http://www.endless.com/Womens-Shoes-Athletic/b/242170011/ref=topnav_sd_womensath_gw?ie=UTF8&pf_rd_r=09QHCFGPDC0JG5V3HPZK&pf_rd_m=AF16NM0QF9TKW&pf_rd_t=101&pf_rd_i=homepage&pf_rd_p=1360251442&pf_rd_s=headernav#%26priceHigh%3D17096%26node%3D242170011%26nodes%3D242170011%26sort%3D-price%26colors%3D519122011%26heelheights%3D519508011%26ref%3Dsr_nr_all%26pageCode%3Db

Now, I am able to build the query/filter based on multiple
combinations of OR/AND/MISSING/EXIST filters as hinted by Igor Motov.

But when I go to build the range facet and sorting on the "calculated"
price, I am stuck.

So I am back to think about how to effectively reindex the products
when sale period expires as suggested by Randy, and I find that the
update API can quickly mass update many records using scripts
conveniently and I think I will go in this way. And if there is any
better way, free feel to share, many thanks.

Wing

On Fri, Jun 1, 2012 at 1:49 AM, Igor Motov imotov@gmail.com wrote:

If you have some search criteria based on price: filter(price), you can
formulate you search query like this:

(
(sale_price != null AND sale_price_start <= NOW AND NOW <=
sale_price_end) AND filter(sale_price)
)
OR
(
(sale_price == null OR sale_price_start > NOW OR sale_price_end < NOW) AND
filter(list_price)
)

That's not valid elasticsearch synax, but, hopefully, it demonstrates the
idea.

On Tuesday, May 29, 2012 11:47:59 PM UTC-4, RKM wrote:

Your previous solution is still the best and perfectly applicable.

Sent from my iPhone

On May 29, 2012, at 7:27 PM, Yiu Wing TSANG ywtsang@gmail.com wrote:

Hi,

I am evaluating to migrate our old search system using lucene to elastic
search and I want to do a query/filter on a value of my document that
depends on current time and multiple fields.

Let me describe by example:

Document: Product

Fields:
list_price / Double
sale_price / nullable Double
sale_price_start / nullable Date
sale_price_end / nullable Date

The logic/calculation of the "value" that I want to do query/filter is:
sale_price != null && (sale_price_start <= NOW <= sale_price_end) ?
sale_price : list_price

Because "NOW" is changing, we cannot index the value as a stored field.

But I know we can do query / filter using script field to dynamically
calculate the value. However the performance should be bad because it needs
to calculate the value for all documents in the index.

More background:
in our old days, we just index the calculated value and have another
daily cron to find and reindex all documents which have the calculated
values "expired" (because the sale_price_start and sale_price_end already
expires)

so now I just want to see if there is any advice to model this problem
in elasticsearch.

Thanks,
Wing


(system) #5