I am looking for a way to remove duplicated search results in ES, and I
would be grateful for anybody's help.
First, let me explain the requirement. I have indexed three documents; each
one has its own unique primary key, but all three share the same docid. Such
documents may be published by the same author at different times. If I
search for the related documents in ES, I get all three, but I only want
the newest one, so I need to remove the other duplicates.
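To make the requirement concrete, here is a minimal client-side sketch in
Python of the result I am after. It assumes each hit has already been turned
into a dict carrying the shared docid and a sortable timestamp. The field
name `published_at`, the helper `newest_per_docid` and the sample data are
made up for illustration; none of them is part of ES.

```python
def newest_per_docid(hits, key="docid", ts="published_at"):
    """Keep only the newest hit for each docid.

    `hits` is any iterable of dicts; `key` and `ts` are hypothetical
    field names for the shared id and the publication time.
    """
    newest = {}
    for hit in hits:
        current = newest.get(hit[key])
        # Keep this hit if it is the first one seen for the docid,
        # or if it is newer than the one kept so far.
        if current is None or hit[ts] > current[ts]:
            newest[hit[key]] = hit
    return list(newest.values())


# Hypothetical hits: three versions of docid "A" and one of docid "B".
# ISO 8601 date strings compare correctly as plain strings.
hits = [
    {"id": 1, "docid": "A", "published_at": "2013-01-01"},
    {"id": 2, "docid": "A", "published_at": "2013-06-01"},
    {"id": 3, "docid": "A", "published_at": "2013-03-01"},
    {"id": 4, "docid": "B", "published_at": "2013-02-01"},
]
result = newest_per_docid(hits)
```

Doing this on the client only sees the hits that were actually fetched, so
paging and total hit counts would no longer be reliable. That is why I would
prefer to remove the duplicates inside ES itself.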
I tried to develop a custom plugin to implement this requirement, but in the
end I failed, because I found no way to make my plugin run after ES has
collected all the search results. Has anyone encountered the same problem?
Some people have run into the same problem; see the following link.
Lucene has a filter called DuplicateFilter, which can remove duplicate
values from search results. Maybe I can use this filter to remove the
articles that have the same author.
Please see the following link.
However, the Lucene filter cannot be used in ES directly. Some people have
met the same problem, and kimchy has given a solution. Please take a look at
the following link.
Other people also want to use DuplicateFilter in ES and have asked kimchy
for help. The following link shows the details.
So we may have a way to solve our problem, but it is not the best one in
kimchy's opinion.
In short, none of the above approaches is a perfect solution. Has anybody
else met the same problem?