Now I want to get all documents where "at least X elements of a are between Y and Z".
So, this example document will match "1 element between 1 and 9" and "2 elements between 1 and 9", but won't match "3 elements between 1 and 9".
I think I should use a range filter first, but what should I do next? Which filter should I use for this?
Elasticsearch doesn't support running such a query directly. You could potentially do it if you stored every value in a different field, using a bool query with minimum_should_match. However, this might not work well for you, depending on the maximum number of values a document may have.
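To make the suggestion above a bit more concrete, here is a rough sketch of how such a query body could be built. It assumes the values are stored in separate fields named a_0, a_1, ... (that naming scheme and the helper name are made up for illustration, they are not anything Elasticsearch provides) and that "between" is meant inclusively.

```python
def at_least_query(field_prefix, max_values, x, y, z):
    """Build a bool query body: at least x of the per-value fields lie in [y, z].

    Assumes every value lives in its own field: <field_prefix>_0, <field_prefix>_1, ...
    (a hypothetical naming scheme, not an Elasticsearch convention).
    """
    # One range clause per possible value slot.
    should = [
        {"range": {f"{field_prefix}_{i}": {"gte": y, "lte": z}}}
        for i in range(max_values)
    ]
    # minimum_should_match requires at least x of the should clauses to match.
    return {"query": {"bool": {"should": should, "minimum_should_match": x}}}


# "at least 2 elements of a between 1 and 9", with at most 3 values per document
query = at_least_query("a", 3, 2, 1, 9)
```

The number of clauses grows with the maximum number of values per document, which is why this approach may not scale to documents with many values.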
I'll try to explain the real task in a bit more detail. We've got a 'users' index containing 'user' documents. Users can perform events (there are many event types: event1, event2, ...).
The real task is to find users who performed event A at least B times between times C and D.
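Just to pin down the condition itself, here it is as a plain Python check for a single user. This is only a sketch of the matching rule, not an Elasticsearch query; the function name is made up, the timestamps are assumed to be integers for one event type, and the time window is assumed to be inclusive on both ends.

```python
from bisect import bisect_left, bisect_right


def performed_at_least(timestamps, b, c, d):
    """Return True if at least b of the timestamps fall within [c, d]."""
    ts = sorted(timestamps)  # bisect needs sorted input
    # Number of timestamps t with c <= t <= d.
    count = bisect_right(ts, d) - bisect_left(ts, c)
    return count >= b
```

For example, a user with timestamps [3, 7, 20] for event A satisfies "at least 2 times between 1 and 9" but not "at least 3 times between 1 and 9".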
In my first attempt, I tried storing the event timestamps as an array for each event type (the integers are timestamps) in the 'users' index.
We already have a 'users' index (about 20 million users) and about 500 million events in our system right now. I don't want to create another 'events' index with 500 million records just for this new query.
The time window is usually measured in days, so I thought about a data structure something like this as an optimization: