I am quite new to Elasticsearch and have a design/query question. We have a lot of classifieds that we store and query via Elasticsearch. Each contains some product data, a visibility date range (when it appeared and disappeared) and a price.
We want to query the number of classifieds visible per week (let's say for the last 12 months) and the average price of the classifieds visible in each week.
We tried creating buckets for the start date and then sub-buckets over the end date, and then aggregating the results to get the count per week and the average per week. But we always have to go over the full index to get all relevant classifieds, and I am not sure that scales.
The alternative we are thinking about is to store an array of weeks where the classified is visible and then aggregate over this array. This seems to work fine but we have to store some extra data.
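A minimal sketch of how such an array of weeks could be computed at index time. The field names (`visible_from`, `visible_to`, `visible_weeks`) and the choice to represent each week by the ISO date of its Monday are assumptions for illustration, not something fixed by Elasticsearch.

```python
from datetime import date, timedelta


def visible_weeks(start: date, end: date) -> list[str]:
    """Return the Monday (as an ISO date string) of every week that
    overlaps the inclusive visibility range [start, end]."""
    # Snap the start date back to the Monday of its week.
    week = start - timedelta(days=start.weekday())
    weeks = []
    while week <= end:
        weeks.append(week.isoformat())
        week += timedelta(days=7)
    return weeks


# Hypothetical document as it might be sent to the index.
doc = {
    "price": 120.0,
    "visible_from": "2019-01-02",
    "visible_to": "2019-01-15",
    "visible_weeks": visible_weeks(date(2019, 1, 2), date(2019, 1, 15)),
}
```

The extra storage is one date per week of visibility, so a classified that is visible for three months adds roughly 13 values.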
Is it possible to store a period and use it to aggregate?
Hi Marcus.
Are you using the range field?
While this supports search on these values, aggregations are still an ongoing piece of work.
No, we have two attributes with the start and end date. But it looks like it would be smarter to use date_range? Would it be possible to aggregate documents over time buckets and include all classifieds whose date range includes the timestamp that the bucket represents?
It's more convenient and likely more performant for queries such as "get all ads active between X and Y dates".
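As a rough sketch of that kind of search, the request bodies below are built as plain Python dicts. The field name `visibility`, the date format, and the typeless mapping layout (as used in Elasticsearch 7 and later) are assumptions; older versions nest the properties under a mapping type.

```python
import json

# Mapping with a single date_range field (hypothetical field name).
VISIBILITY_MAPPING = {
    "mappings": {
        "properties": {
            "price": {"type": "double"},
            "visibility": {"type": "date_range", "format": "yyyy-MM-dd"},
        }
    }
}


def classified_doc(price, start, end):
    """Document body storing the visibility period as one range value."""
    return {"price": price, "visibility": {"gte": start, "lte": end}}


def active_between(start, end):
    """Range query matching ads whose visibility overlaps [start, end]."""
    return {
        "query": {
            "range": {
                "visibility": {
                    "gte": start,
                    "lte": end,
                    # "intersects" is the default relation; "within" and
                    # "contains" are the other options for range fields.
                    "relation": "intersects",
                }
            }
        }
    }


print(json.dumps(active_between("2019-01-01", "2019-01-31"), indent=2))
```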
However aggregations like date histograms don't yet support routing of range field values into the various buckets - that development work is what was in the "ongoing work" link I shared.
In the interim your array-of-dates approach sounds like what you would need to make aggregations like date histograms work.
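A sketch of what the aggregation over such an array could look like, assuming the weeks are stored in a multi-valued `date` field called `visible_weeks` and the price in `price` (both names are hypothetical). A document with several values in the field is counted once in each matching bucket, which is what gives the per-week count and average. `calendar_interval` is the parameter name in recent Elasticsearch versions; older versions use `interval` instead.

```python
import json


def weekly_stats_query(field="visible_weeks", price_field="price",
                       since="now-12M/w"):
    """Build a search body returning doc count and average price per week."""
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        # Keep only classifieds with at least one visible week in the window.
        # Their older weeks still land in earlier buckets, so buckets before
        # the window should be ignored on the client side.
        "query": {"range": {field: {"gte": since}}},
        "aggs": {
            "per_week": {
                "date_histogram": {"field": field, "calendar_interval": "week"},
                "aggs": {"avg_price": {"avg": {"field": price_field}}},
            }
        },
    }


print(json.dumps(weekly_stats_query(), indent=2))
```

In the response, each bucket's `doc_count` would be the number of classifieds visible in that week and `avg_price.value` their average price.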
We will go with the list of weeks in which the classified was visible. Thanks a lot for your help!