How to delete duplicate data based on a query in 2.3

Hi Elasticsearch:

It seems delete-by-query is no longer built in as of 2.3.
We uploaded the same data multiple times, so we need to delete the duplicate data by query.

And because we are using AWS ES, there is no way to install the delete-by-query plugin.

Is there a way to fix the data in Elasticsearch 2.3?

Well, start with the basics:

https://www.elastic.co/guide/en/elasticsearch/plugins/2.3/delete-by-query-plugin-reason.html

OK, now that that is done, there are two options.

The first: write a query that fetches all the doc IDs, then write some code that issues a delete request for each document, e.g. curl -XDELETE 'localhost:9200/customer/external/2?pretty'

https://www.elastic.co/guide/en/elasticsearch/reference/2.3/_deleting_documents.html
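A rough sketch of that first option in Python, using only the standard library. It is an illustration, not a tested tool. It assumes the cluster is reachable at http://localhost:9200 without request signing (an AWS ES domain may need signed requests or an IP-based access policy), that duplicates can be recognised by comparing a few fields of _source, and that everything fits in memory. The helper names and the batch size are made up for this example. Instead of one DELETE per document it sends the deletes through the _bulk endpoint, which does the same thing in fewer round trips.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: same host as the curl example


def es_request(method, path, body=None):
    """Send one request to Elasticsearch and return the decoded JSON response."""
    data = body.encode("utf-8") if body is not None else None
    req = urllib.request.Request(ES_URL + path, data=data, method=method)
    req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def scroll_hits(index, query, page_size=500):
    """Yield every hit matching `query`, page by page, via the scroll API."""
    body = json.dumps({"query": query, "size": page_size, "sort": ["_doc"]})
    page = es_request("POST", "/%s/_search?scroll=1m" % index, body)
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit
        body = json.dumps({"scroll": "1m", "scroll_id": page["_scroll_id"]})
        page = es_request("POST", "/_search/scroll", body)


def duplicate_hits(hits, key_fields):
    """Return hits whose key_fields values were already seen (the first copy is kept)."""
    seen = set()
    duplicates = []
    for hit in hits:
        key = tuple(json.dumps(hit["_source"].get(f), sort_keys=True)
                    for f in key_fields)
        if key in seen:
            duplicates.append(hit)
        else:
            seen.add(key)
    return duplicates


def bulk_delete_body(hits):
    """Build a newline-delimited _bulk body with one delete action per hit."""
    lines = [json.dumps({"delete": {"_index": h["_index"],
                                    "_type": h["_type"],
                                    "_id": h["_id"]}})
             for h in hits]
    return "\n".join(lines) + "\n" if lines else ""


def delete_duplicates(index, query, key_fields, batch_size=1000):
    """Scroll the index, find duplicates, and delete them in batches."""
    dups = duplicate_hits(scroll_hits(index, query), key_fields)
    for start in range(0, len(dups), batch_size):
        es_request("POST", "/_bulk", bulk_delete_body(dups[start:start + batch_size]))
    return len(dups)
```

Something like delete_duplicates("customer", {"match_all": {}}, ["name"]) would then remove every document whose "name" matches an earlier one; "name" is only a placeholder for whatever fields identify a duplicate in your data. Try it on a copy of the index first.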

Or, a slightly easier way:

Use Logstash to re-index into a new index:

Logstash (input: Elasticsearch old index)  ->  filter (if bad doc, drop)  ->  output (Elasticsearch new index)
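To make the input -> filter -> output flow concrete, here is the filter step sketched in Python rather than as Logstash configuration. It assumes a "bad doc" means one whose _source is identical to a document already copied; that definition, the function names, and the index name customer_v2 in the note below are all hypothetical. Each surviving document gets an _id derived from a hash of its content, so re-running the copy overwrites documents instead of duplicating them again.

```python
import hashlib
import json


def fingerprint(source):
    """Return a stable SHA-1 hex digest of a document's _source."""
    canonical = json.dumps(source, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def reindex_actions(hits, new_index):
    """Build a _bulk body copying hits into new_index, one copy per distinct _source."""
    seen = set()
    lines = []
    for hit in hits:
        doc_id = fingerprint(hit["_source"])
        if doc_id in seen:
            continue  # filter step: drop a document we have already copied
        seen.add(doc_id)
        # Assumption: the content hash is used as the new _id
        action = {"index": {"_index": new_index,
                            "_type": hit["_type"],
                            "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"]))
    return "\n".join(lines) + "\n" if lines else ""
```

The hits would come from a scroll over the old index, and the returned string would be POSTed to the _bulk endpoint, for example with new_index set to "customer_v2". Once the new index looks right, point your application at it and delete the old one.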
