[Ingest Pipeline] Drop fields if duplicate

Fernando_Ruben_Otero · May 29, 2020, 6:41pm

Is it possible to drop a document if there's another document with the same _source, or with the same subset of fields from _source?

    "processors": [
      {
        "drop": {
          "if": "ctx._source == documents[ctx._id]._source"
        }
      }
    ]

Christian_Dahlqvist · May 31, 2020, 3:29pm

Processors generally work within the context of a single document, so they do not have access to other documents already in the index. If you are looking to avoid duplicates, you can do this by assigning a predictable ID, which will cause an update rather than a new document when the duplicate arrives. This is described in this old blog post.
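
To illustrate the predictable-ID idea, here is a minimal sketch that derives an ID by hashing the document's content, or a chosen subset of its fields. The helper name `predictable_id`, the sample documents, and the choice of SHA-256 are assumptions made for illustration; they are not taken from the blog post.

```python
import hashlib
import json


def predictable_id(doc, fields=None):
    """Return a deterministic ID derived from the document's content.

    Hypothetical helper: hashes the whole document, or only the given
    subset of fields when `fields` is provided.
    """
    subset = doc if fields is None else {f: doc.get(f) for f in fields}
    # sort_keys makes the serialization independent of key order,
    # so logically identical documents hash to the same value.
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Sample documents (made up for this example).
a = {"user": "alice", "message": "login", "ts": "2020-05-29T18:41:00Z"}
b = {"message": "login", "ts": "2020-05-29T18:41:00Z", "user": "alice"}
c = {"user": "alice", "message": "login", "ts": "2020-05-29T18:42:00Z"}

# Same content in a different key order gives the same ID.
same_full = predictable_id(a) == predictable_id(b)
# A different timestamp gives a different ID when hashing the whole document.
diff_full = predictable_id(a) != predictable_id(c)
# Hashing only a subset of fields treats a and c as duplicates.
same_subset = predictable_id(a, ["user", "message"]) == predictable_id(c, ["user", "message"])
```

The resulting string would be sent as the document's `_id` when indexing. A second document with the same `_id` then overwrites the first instead of creating a duplicate. If you would rather keep the first copy, indexing with `op_type=create` makes Elasticsearch reject the later one with a version conflict.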

