Transform API updates documents already inserted

Hey, I am facing a big issue as we are about to go to production using the Transform API. The aggregations calculated by a transform are updated in place whenever a document for the same terms bucket already exists in the destination index.

Here is the transformation:

POST _transform/_preview
{
  "source": {
    "index": "sourceIndex",
    "query": {
      "bool": {
        "filter": [
          {
            "range": {
              "@timestamp": {
                "gte": "now/d-7d-7d",
                "lt": "now/d-7d"
              }
            }
          }
        ]
      }
    }
  },
  "dest": {
    "index": "destinationIndex",
    "pipeline": "add_ingestedAt"
  },
  "pivot": {
    "group_by": {
      "os": { "terms": { "field":"infos.os"}}
    },
    "aggs": {
      "mac_unique": {
        "cardinality": {
          "field": "mac_adress"
        }
      }
    }
  }
}

The response is as follows:

"preview" : [
    {
      "os" : "os1",
      "mac_unique" : 9,
      "ingestedAt": "2021-10-19T12:34:25Z"
    },
    {
      "os" : "os2",
      "mac_unique" : 3,
      "ingestedAt": "2021-10-19T12:34:25Z"
    },
    {
      "os" : "os3",
      "mac_unique" : 3,
      "ingestedAt": "2021-10-19T12:34:25Z"
    }
  ]
....

The problem is that when I insert a new document, say one with os=os3, I expect a new document to be inserted in the destination index, as follows:

"preview" : [
    {
      "os" : "os1",
      "mac_unique" : 9,
      "ingestedAt": "2021-10-19T12:34:25Z"
    },
    {
      "os" : "os2",
      "mac_unique" : 3,
      "ingestedAt": "2021-10-19T12:34:25Z"
    },
    {
      "os" : "os3",
      "mac_unique" : 3,
      "ingestedAt": "2021-10-19T12:34:25Z"
    },
    {
      "os" : "os3",
      "mac_unique" : 1,
      "ingestedAt": "2021-10-20T12:34:25Z"
    }
  ]
....

But it gets updated like this:

"preview" : [
    {
      "os" : "os1",
      "mac_unique" : 9,
      "ingestedAt": "2021-10-19T12:34:25Z"
    },
    {
      "os" : "os2",
      "mac_unique" : 3,
      "ingestedAt": "2021-10-19T12:34:25Z"
    },
    {
      "os" : "os3",
      "mac_unique" : 4,
      "ingestedAt": "2021-10-20T12:34:25Z"
    }
  ]
....
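
To make the difference concrete, here is a minimal Python sketch of the two behaviours. It assumes the destination index keeps one document per distinct group_by key, which is my reading of what I observe and not something I have confirmed. The `pivot` helper, the `day` field, and the sample documents are made up for illustration and are not part of the real index.

```python
from collections import defaultdict


def pivot(docs, key_fn):
    """Group docs by key_fn and count distinct MAC addresses per group."""
    groups = defaultdict(set)
    for doc in docs:
        groups[key_fn(doc)].add(doc["mac_adress"])
    return {key: len(macs) for key, macs in groups.items()}


# Hypothetical source documents: three os3 devices on the first day,
# then one new os3 device inserted the next day.
docs = [
    {"os": "os3", "mac_adress": "aa", "day": "2021-10-19"},
    {"os": "os3", "mac_adress": "bb", "day": "2021-10-19"},
    {"os": "os3", "mac_adress": "cc", "day": "2021-10-19"},
    {"os": "os3", "mac_adress": "dd", "day": "2021-10-20"},
]

# Key is only the os value: one row per os, recomputed over all documents.
by_os = pivot(docs, lambda d: d["os"])

# Key is the os value plus a day: one row per os per day.
by_os_and_day = pivot(docs, lambda d: (d["os"], d["day"]))
```

Grouping only by `os` yields a single os3 row with a count of 4, which matches what I see. Grouping by `os` plus a day yields the separate rows (3 and 1) that I expected.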

Why does this happen? We expect each run to insert a new document in the destination index, like raw data, rather than update the existing one. Is there any workaround or suggestion?
