Conditional filtering of documents based on other documents


(Mahmoud Alnahlawi) #1

I have the following use case...

Document has three fields:
id: Int
name: String
environment: Int

The _id of the document is a composite id and environment.

Here's sample data

id Name environment _id
1 one one 1 1_1
1 one two 2 1_2
3 three one 1 3_1

Now, I want to issue a query that give me a single document (id) per
environment. If a document is in multiple environments, I want to pick the
one with the highest environment number.

The solution, further more, needs to play nice with pagination.

The way I would do that with solr is using field collapsing. I would group
the results by id (and then return the one with the highest environment).
The number of hits is the number of groups (unique).

I considered three approaches but don't really like either.

First is to sort the results by the id and then dedup them in the client.
Pagination would be hairy to work out and total number of hits would not
be straight forward to get.

Second, Facet on the id. Not quite sure how this would work given that,
based on my reading, that I can only return some sort of a count from the
facts.

Issue multiple queries (seem like it's too much of brute force).

Ideas? Extra points if the solution works with ES 0.9.10

Thanks!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3fea6b74-37db-40f9-80f8-3c71650c1182%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(system) #2