Alternate to multi-terms aggregation in transform

Hello Team ES,

We have a use case where this transform works well for us.-

PUT _transform/tradebystatus
{
  "source": {
    "index": [
      "latest-transactions"
    ]
  },
  "pivot": {
    "group_by": {
      "tradeDateTime": {
        "date_histogram": {
          "field": "tradeDateTime",
          "calendar_interval": "1m"
        }
      },
      "submissionAccountName": {
        "terms": {
          "field": "submissionAccountName.keyword"
        }
      },
      "executingEntityIdCode": {
        "terms": {
          "field": "executingEntityIdCode.keyword"
        }
      },
      "assetClass": {
        "terms": {
          "field": "assetClass.keyword"
        }
      },
      "regulator": {
        "terms": {
          "field": "regulator.keyword"
        }
      }
    },
    "aggregations": {
      "genres_and_products": {
        "multi_terms": {
          "terms": [{
            "field": "status.keyword" 
          }, {
            "field": "assetClass.keyword"
          }]
        }
      }
    }
  },
  "frequency": "10s",
  "dest": {
    "index": "tradebystatus"
  },
  "sync": {
    "time": {
      "delay": "1s",
      "field": "ingest_time"
    }
  },
  "retention_policy": {
    "time": {
      "field": "ingest_time",
      "max_age": "7d"
    }
  }
}

How ever the requirement changed slightly, that any of the field in group by can be updated., so we need to move it from the group by part and move to the aggregation part.

Multi terms aggregations are not allowed in transform. Is there an alternate to this use case?

can you post a data example?

If you look for copying over submissionAccountName, executingEntityIdCode, assetClass, regulator(or a subset of those) from the source into the destination index without grouping them, I suggest to look into top_metrics (you can create 1 top_metric targetting multiple fields at once).

If any of the ootb aggregations do not provide what you are looking for you can use scripted_metric. We have some examples for it in the docs.

Thank you for your reply Hendrik. I shall try the suggested APIs and then come back on this.

hi Hendrik,

I tried your suggestion but I'm not able to get the desired output . For input data -

{
        "_index": "latest-transactions",
        "_id": "NnLULrwn4Me522lTlVzUwIbJAAAAAAAA",
        "_score": 1,
        "_source": {
          "ingest_time": "2022-07-11T13:43:28.554096099Z",
          "transactionReferenceNumber": "ref1",
          "warningDescriptions": "Reported - non-MiFID eligibleeee",
          "submissionAccountName": "ACCOUNT1",
          "warnings": "W6009",
          "assetClass": "EQUI",
          "initial_ingest_time": "2022-07-11T13:43:28.554096099Z",
          "tradeDateTime": "2022-07-11T13:07:39.137029349Z",
          "regulator": "BaFIN",
          "instructionId": "ins1",
          "payload_ts": "1657530982219",
          "executingEntityIdCode": "635400BDQCJNMOGTBB61",
          "status": "NEW"
        }
      },
      {
        "_index": "latest-transactions",
        "_id": "NnJtzRgray9JKJWDQ3aQCIE7AAAAAAAA",
        "_score": 1,
        "_source": {
          "ingest_time": "2022-07-11T13:50:26.865951666Z",
          "transactionReferenceNumber": "ref3",
          "warningDescriptions": "Reported - non-MiFID eligibleeee",
          "submissionAccountName": "ACCOUNT1",
          "warnings": "W6009",
          "assetClass": "EQUI",
          "initial_ingest_time": "2022-07-10T13:35:54.884067700Z",
          "tradeDateTime": "2022-07-10T13:07:39.137029349Z",
          "regulator": "BaFIN",
          "instructionId": "ins3",
          "payload_ts": "1657411200000",
          "executingEntityIdCode": "635400BDQCJNMOGTBB61",
          "status": "NEW-updated"
        }
      },
      {
        "_index": "latest-transactions",
        "_id": "NnIsJ6ZK7KWrrvyEazVThqIcAAAAAAAA",
        "_score": 1,
        "_source": {
          "ingest_time": "2022-07-11T15:25:10.473393438Z",
          "transactionReferenceNumber": "ref2",
          "warningDescriptions": "Reported - non-MiFID eligibleeee",
          "submissionAccountName": "ACCOUNT1",
          "warnings": "W6009",
          "assetClass": "BOND",
          "initial_ingest_time": "2022-07-11T13:43:37.870259331Z",
          "tradeDateTime": "2022-07-11T13:07:39.137029349Z",
          "regulator": "BaFIN",
          "instructionId": "ins2",
          "payload_ts": "1657530982219",
          "executingEntityIdCode": "635400BDQCJNMOGTBB61",
          "status": "NEW-2"
        }
      },
      {
        "_index": "latest-transactions",
        "_id": "NnLvT60H82772ZWc0douckcuAAAAAAAA",
        "_score": 1,
        "_source": {
          "initial_ingest_time": "2022-07-13T12:59:54.278717796Z",
          "ingest_time": "2022-07-13T12:59:54.278717796Z",
          "transactionReferenceNumber": "ref4",
          "submissionAccountName": "ACCOUNT1",
          "tradeDateTime": "2022-07-08T13:07:39.137029349Z",
          "regulator": "BaFIN",
          "instructionId": "ins4",
          "assetClass": "EQUI",
          "payload_ts": "1657717128548",
          "executingEntityIdCode": "635400BDQCJNMOGTBB61",
          "status": "NEW"
        }
      }
    ]

I have this transform -



POST _transform/_preview
{
  "source": {
    "index": "latest-transactions"
  },
  "pivot": {
    "group_by": {
      "tradeDateTime": {
        "date_histogram": {
          "field": "tradeDateTime",
          "calendar_interval": "1m"
        }
      },
      "executingEntityIdCode": {
        "terms": {
          "field": "executingEntityIdCode.keyword"
        }
      }      
    },
    "aggs": {
    "status": {
      "terms": {
        "field": "status.keyword"
      },
      "aggs": {
        "tm": {
          "top_metrics": {
            "metrics": [
              {
                "field": "submissionAccountName.keyword"
              },
              {
                "field": "assetClass.keyword"
              },
              {
                "field": "regulator.keyword"
              },
              {
                "field": "status.keyword"
              }
            ],
            "sort": {
              "ingest_time": "desc"
            }
          }
        }
      }
      
    },
    "status_count": {
          "value_count": {
            "field": "status.keyword"
          }
        }
  }
  }
}

and result is

{
  "preview": [
    {
      "tradeDateTime": "2022-07-08T13:07:00.000Z",
      "status_count": 1,
      "executingEntityIdCode": "635400BDQCJNMOGTBB61",
      "status": {
        "NEW": {
          "tm": {
            "submissionAccountName.keyword": "ACCOUNT1",
            "assetClass.keyword": "EQUI",
            "regulator.keyword": "BaFIN",
            "status.keyword": "NEW"
          }
        }
      }
    },
    {
      "tradeDateTime": "2022-07-10T13:07:00.000Z",
      "status_count": 1,
      "executingEntityIdCode": "635400BDQCJNMOGTBB61",
      "status": {
        "NEW-updated": {
          "tm": {
            "submissionAccountName.keyword": "ACCOUNT1",
            "assetClass.keyword": "EQUI",
            "regulator.keyword": "BaFIN",
            "status.keyword": "NEW-updated"
          }
        }
      }
    },
    {
      "tradeDateTime": "2022-07-11T13:07:00.000Z",
      "status_count": 2,
      "executingEntityIdCode": "635400BDQCJNMOGTBB61",
      "status": {
        "NEW": {
          "tm": {
            "submissionAccountName.keyword": "ACCOUNT1",
            "assetClass.keyword": "EQUI",
            "regulator.keyword": "BaFIN",
            "status.keyword": "NEW"
          }
        },
        "NEW-2": {
          "tm": {
            "submissionAccountName.keyword": "ACCOUNT1",
            "assetClass.keyword": "BOND",
            "regulator.keyword": "BaFIN",
            "status.keyword": "NEW-2"
          }
        }
      }
    }
  ],

Is there a way for me to know the count of unique status values in a map (may be) - like this
status{
New: 1
New-2 : 1
}

I am trying to put this whole output in a sub aggregation and run a scripted metric to traverse the status field and store occurrences in a map but it's failing syntactically.

You can either add another sub aggregation to your terms agg with a value_count aggregation. However the output will be nested in a separate sub field:

"NEW-2": {
  "tm": {
    "submissionAccountName.keyword": "ACCOUNT1",
    "assetClass.keyword": "BOND",
    "regulator.keyword": "BaFIN",
    "status.keyword": "NEW-2"
  },
  "count": 1
}

or if you add the same terms aggregation again, but without a sub aggregation, you get the schema you want:

"status-counts": {
  "New": 1
  "New-2" : 1
}

It is not possible to change the structure in transform, however you can use an ingest pipeline to re-arrange the document structure (rename fields).

Thank you for all the help Hendrik. Much appreciated!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.