Finding one document from each group of similar documents

I have an Elasticsearch index with many documents. The documents fall into
groups whose members have similar, but not identical, content.

What query would return only one document from each group of similar
documents?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

How do you define similar? This could be done either by splitting those
groups into different types, or by using MoreLikeThis.
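
For illustration, here is a rough sketch of what a MoreLikeThis search body could look like, built as a plain Python dict. The field name "content" and the helper more_like_this_query are made up for this example. The "like" key is what recent Elasticsearch releases accept; older releases used "like_text" instead, so check the docs for your version. Note that this only finds documents similar to a given text; it does not by itself return one document per group.

```python
import json


def more_like_this_query(text, fields=("content",), max_hits=10):
    """Build a search body that finds documents similar to `text`.

    `fields` and `max_hits` are illustrative defaults, not values
    taken from the thread.
    """
    return {
        "size": max_hits,
        "query": {
            "more_like_this": {
                "fields": list(fields),
                "like": text,
                # Lowered from the defaults so short texts still match.
                "min_term_freq": 1,
                "min_doc_freq": 1,
            }
        },
    }


body = more_like_this_query("text of one representative document")
print(json.dumps(body, indent=2))
```

The resulting body can be sent to the index's _search endpoint with whatever client you already use.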

On Sun, Apr 28, 2013 at 3:43 PM, Muhammad Adel devadel@gmail.com wrote:

I have an Elasticsearch index with many documents. Every group of these
documents contains similar content, but not exactly the same.

What is the query that would return only one document from each group of
similar documents?

IMHO, I would either not index similar documents at all, or filter them out on the client side.

To filter on the client side, you will probably first have to add a "hashcode" field to each document before indexing, set to the same value for all similar documents.
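
A minimal sketch of that idea in Python, under a loud assumption: "similar" here means "identical after lowercasing and dropping punctuation and extra whitespace", which is only one possible definition and probably too strict for many data sets. The field names "content" and "hashcode" and all three helper functions are hypothetical; the hits are assumed to have the usual shape of Elasticsearch search hits, with the document under "_source".

```python
import hashlib
import re


def similarity_hashcode(text):
    """Return a hash shared by documents whose normalized text is identical."""
    # Assumption: "similar" means equal after lowercasing and collapsing
    # every run of non-alphanumeric characters into a single space.
    normalized = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def add_hashcode(doc, text_field="content"):
    """Return a copy of `doc` with a 'hashcode' field, ready for indexing."""
    enriched = dict(doc)
    enriched["hashcode"] = similarity_hashcode(doc[text_field])
    return enriched


def one_per_group(hits):
    """Client-side filter: keep only the first hit seen for each hashcode."""
    seen = set()
    unique = []
    for hit in hits:
        code = hit["_source"]["hashcode"]
        if code not in seen:
            seen.add(code)
            unique.append(hit)
    return unique
```

Call add_hashcode on each document before indexing it, then pass the hits of any search response through one_per_group. Because the filtering happens after the search, a page of results can come back shorter than the requested size.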

My 2 cents

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
