Deduplicating using nested query

Jilles_van_Gurp · June 11, 2014, 12:52pm

I have a simple problem where it would be useful to run a query like: get me
everything that matches query1, except if field foo is in the results of
query2.

The simple solution is to first run query2, fetch the foo field for all the
results (potentially thousands), collect the values into one gigantic set,
and turn that set into a bool filter with a lot of negated term queries,
which then gets added to query1.
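
A minimal sketch of that two-step approach, for illustration only. It assumes the hits from query2 are already in hand and shaped like the `hits.hits` array of a search response. The helper names `collect_foo_values` and `exclude_values`, the sample hits, and the `match` query standing in for query1 are all hypothetical. It also swaps the many individual term queries for a single `terms` query inside `must_not`, which may not match the filter syntax of every Elasticsearch version.

```python
from typing import Any, Dict, Iterable, List


def collect_foo_values(hits: Iterable[Dict[str, Any]], field: str = "foo") -> List[Any]:
    """Collect the distinct values of `field` from query2's hits, preserving order."""
    seen = set()
    values = []
    for hit in hits:
        value = hit.get("_source", {}).get(field)
        if value is not None and value not in seen:
            seen.add(value)
            values.append(value)
    return values


def exclude_values(query1: Dict[str, Any], field: str, values: List[Any]) -> Dict[str, Any]:
    """Wrap query1 in a bool query that rejects documents whose `field` is in `values`."""
    if not values:
        # Nothing to exclude, so send query1 unchanged.
        return {"query": query1}
    return {
        "query": {
            "bool": {
                "must": [query1],
                "must_not": [{"terms": {field: values}}],
            }
        }
    }


# Hypothetical hits, shaped like the "hits.hits" array of a search response for query2.
query2_hits = [
    {"_id": "1", "_source": {"foo": "a"}},
    {"_id": "2", "_source": {"foo": "b"}},
    {"_id": "3", "_source": {"foo": "a"}},
]
query1 = {"match": {"title": "elasticsearch"}}  # placeholder for the real query1

# Request body for the second search: query1 minus anything query2 matched on foo.
body = exclude_values(query1, "foo", collect_foo_values(query2_hits))
```

Because the exclusion is part of the request body, paging parameters would apply to the already-filtered results, though the set of foo values has to be rebuilt or cached for each page request.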

Since I need to support paging through the results, I can't get away with
simply deduplicating the result set in memory.

Is there a more elegant way to do this in elasticsearch?

Jilles

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/b0567016-730c-4531-bf8a-84eb50e4579b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Antonio_Augusto_Sant · June 11, 2014, 2:56pm

I don't know for sure, but did you try using a filter query?

