Is there any way to remove duplicated search results in ES?

Thank you for your rapid reply.

It is true that I can write my own custom search action, but I cannot
override the default search action, so it is not what I want.
At indexing time there are several listeners where plugins can hook in, but
at search time there is hardly any listener for extending the search
operation, apart from the search action itself.
Why not provide a way to install my own plugin to extend the search phase?
Judging from the source code, it seems simple to do.

Based on your answer, I will give up on the solution that uses the Lucene
duplicate filter.

Your proposal is very useful for solving my problem. I will try it.
Thank you very much!

On Monday, January 20, 2014 at 6:45:40 PM UTC+8, Jörg Prante wrote:

It is not true "there is no chance to install my plugin after ES have
collected all search result". You can implement a plugin with an
alternative search action. The issue you have cited is related to
overriding default actions, and there is good reason for not allowing that.

The Lucene DuplicateFilter works at the segment level and is suitable
neither at the index level nor for distributed search.

The basic idea is: if you want the "newest one" among duplicate documents,
you can sort the docs by timestamp, newest first, and pick the first one,
ignoring the ones that follow.
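A minimal client-side sketch of this idea in Python, assuming each hit is a dict with hypothetical `author` and `timestamp` fields (neither field name comes from this thread):

```python
def newest_per_key(hits, key_field="author", ts_field="timestamp"):
    """Return the newest hit for each distinct value of key_field.

    The field names are assumptions made for illustration.
    """
    seen = set()
    newest = []
    # Sort newest first, so the first hit seen for a key is the one to keep.
    for hit in sorted(hits, key=lambda h: h[ts_field], reverse=True):
        key = hit[key_field]
        if key in seen:
            continue  # an older duplicate (a "follower"): ignore it
        seen.add(key)
        newest.append(hit)
    return newest


hits = [
    {"author": "alice", "timestamp": 3, "title": "a3"},
    {"author": "bob", "timestamp": 2, "title": "b2"},
    {"author": "alice", "timestamp": 1, "title": "a1"},
]
print([h["title"] for h in newest_per_key(hits)])  # prints ['a3', 'b2']
```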

You can use aggregations plus filtered queries to issue a series of
queries against an ES index and deduplicate the results on the client side,
using your custom ordering rules (e.g. one bucket per author, picking at
most one doc per author from the timestamp-sorted result set of a filtered
query). Note that this procedure is very expensive and does not scale.
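The series of queries could look roughly like the following Python sketch. It assumes a hypothetical `search(body)` callable that sends a request body to the index and returns the parsed response. The `author` and `timestamp` field names, the bucket limit, and the request shapes (a `terms` aggregation and the `filtered` query of ES 1.x, which later versions replaced with `bool` plus `filter`) are assumptions, not something stated in this thread.

```python
def dedup_by_author(search, query, max_authors=100):
    """Issue one aggregation request, then one filtered query per bucket.

    `search` is a hypothetical callable: request body (dict) -> response (dict).
    """
    # Step 1: one bucket per author among the documents matching the query.
    agg_body = {
        "size": 0,
        "query": query,
        "aggs": {"authors": {"terms": {"field": "author", "size": max_authors}}},
    }
    buckets = search(agg_body)["aggregations"]["authors"]["buckets"]

    # Step 2: per author, fetch only the newest matching document.
    results = []
    for bucket in buckets:
        body = {
            "size": 1,
            "query": {
                "filtered": {
                    "query": query,
                    "filter": {"term": {"author": bucket["key"]}},
                }
            },
            "sort": [{"timestamp": {"order": "desc"}}],
        }
        hits = search(body)["hits"]["hits"]
        if hits:
            results.append(hits[0]["_source"])
    return results
```

The number of round trips grows with the number of buckets, which is one reason this approach gets expensive.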

The best method, and the preferred solution, is to index deduplicated data,
because it is cheap: fetch the list of docs per author from the original
source and index only the one you want to have in search results.
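As a rough sketch of that indexing-time approach, the Python snippet below picks one record per author from the source data and renders the survivors as newline-delimited bulk index actions. The `author` and `timestamp` fields, the index name, and the choice of the author as document `_id` are illustrative assumptions.

```python
import json


def pick_newest_per_author(records):
    """Keep one record per author: the one with the largest timestamp."""
    chosen = {}
    for rec in records:
        current = chosen.get(rec["author"])
        if current is None or rec["timestamp"] > current["timestamp"]:
            chosen[rec["author"]] = rec
    return list(chosen.values())


def to_bulk_lines(records, index_name="docs"):
    """Render records as bulk-style action/source line pairs.

    Using the author as _id (an assumption) means that indexing a newer
    record for the same author overwrites the older one.
    """
    lines = []
    for rec in records:
        action = {"index": {"_index": index_name, "_id": rec["author"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(rec))
    return "\n".join(lines) + "\n"
```

With this, only one document per author ever reaches the index, so no deduplication is needed at search time.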

Jörg
