In Elasticsearch I'd like to index items of multiple types, for example "blog posts" and "deals". All types in the index should be sorted by their publish date, with a single exception: deals are valid for a certain time span. The position of a deal within the complete list should NOT be based on its publish time but on its "valid until" timestamp. So the question is: how can I sort on the publish date of the blog posts and the "valid until" date of the deals in a single query?
Raw example data
12/06: Blog post 8
11/06: Blog post 7
09/06: Blog post 6
09/06: Deal 2
06/06: Blog post 5
03/06: Blog post 4
29/05: Blog post 3
27/05: Blog post 2
26/05: Deal 1
26/05: Blog post 1
Deal #2 is available from 09/06 to 23/06, deal #1 was available from 26/05 to 09/06, and so on (every x weeks, n new deals become valid for a period T; overlap is possible).
Expected result
This should be the result of the search:
09/06: Deal 2 // Valid until 23/06, so this should be on top
12/06: Blog post 8
11/06: Blog post 7
09/06: Blog post 6
26/05: Deal 1 // It was valid until 09/06, so place this just before the blog post from 09/06
06/06: Blog post 5
03/06: Blog post 4
29/05: Blog post 3
27/05: Blog post 2
26/05: Blog post 1
To make things more difficult, I have to consider these things as well:
- Pagination should work regardless of the page size (shouldn't be a big issue, I guess)
- More types could be added in the future (shouldn't be a big issue either)
- Other types (say, events) could use a similar system, where another time span ("promotional period") is used
- All items can be tagged / categorized, and it should be possible to filter on these tags and use faceted search, all with the same sorting constraints
This is the first time I've dived into ES, so I'm not sure whether this is simply a matter of RTFM, but please give me a nudge in the right direction.
If you have two (or more) date fields to sort on, look at the "copy_to"
mapping feature to copy them over to a third field, e.g. "sort_date". That
gives you a single field you can happily sort on, without having to change
fields in the source.
The same method works for tag/category fields in different indexes that are
meant for facets spanning more than one index.
Jörg
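For illustration, the copy_to suggestion could be sketched roughly as follows. This is only a sketch under assumptions, not a tested setup: the field names "publish_date" and "valid_until" are made up here ("sort_date" is the name from the reply above), the exact nesting of the mapping body depends on the Elasticsearch version, and the "max" sort mode plus the secondary sort on "publish_date" are one possible way to handle a deal that carries both dates. The Python part does not talk to Elasticsearch; it only builds the request bodies and simulates the resulting order locally on the example data from the question.

```python
import json
from datetime import date

# Mapping fragment: both date fields are copied into "sort_date".
# "publish_date" and "valid_until" are hypothetical field names.
MAPPING = {
    "properties": {
        "publish_date": {"type": "date", "copy_to": "sort_date"},
        "valid_until": {"type": "date", "copy_to": "sort_date"},
        "sort_date": {"type": "date"},
    }
}

# Search body: a deal ends up with two values in "sort_date", so
# "mode": "max" picks the later one (its "valid until" date).
# The secondary sort is an assumed tie-breaker; from/size do the paging.
SEARCH_BODY = {
    "from": 0,
    "size": 10,
    "sort": [
        {"sort_date": {"order": "desc", "mode": "max"}},
        {"publish_date": {"order": "desc"}},
    ],
}


def d(day, month):
    """Build a date in 2014, the year of the example data."""
    return date(2014, month, day)


# Example data from the question, in its raw (publish date) order.
DOCS = [
    {"title": "Blog post 8", "publish_date": d(12, 6)},
    {"title": "Blog post 7", "publish_date": d(11, 6)},
    {"title": "Blog post 6", "publish_date": d(9, 6)},
    {"title": "Deal 2", "publish_date": d(9, 6), "valid_until": d(23, 6)},
    {"title": "Blog post 5", "publish_date": d(6, 6)},
    {"title": "Blog post 4", "publish_date": d(3, 6)},
    {"title": "Blog post 3", "publish_date": d(29, 5)},
    {"title": "Blog post 2", "publish_date": d(27, 5)},
    {"title": "Deal 1", "publish_date": d(26, 5), "valid_until": d(9, 6)},
    {"title": "Blog post 1", "publish_date": d(26, 5)},
]


def sort_key(doc):
    """Local stand-in for the two sort clauses in SEARCH_BODY."""
    copied = [doc["publish_date"]]
    if "valid_until" in doc:
        copied.append(doc["valid_until"])
    # (max of the copied values, then publish date as tie-breaker)
    return (max(copied), doc["publish_date"])


ranked = sorted(DOCS, key=sort_key, reverse=True)

print(json.dumps(MAPPING, indent=2))
print(json.dumps(SEARCH_BODY, indent=2))
for doc in ranked:
    print(doc["title"])
```

With the example data, the simulated order has Deal 2 first and Deal 1 directly after Blog post 6, which matches the expected result in the question. An explicit tie-breaker also keeps the order stable across pages.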
On Thu, Jun 12, 2014 at 3:29 PM, Jurian Sluiman jurian@soflomo.com wrote:
[...]
I was just testing this with the _timestamp field, which I need to set
manually for each item, but copy_to seems even better. Thanks for the
insights!
Jurian
On Thursday, June 12, 2014 5:28:49 PM UTC+2, Jörg Prante wrote:
[...]