Cannot start transform job - Cannot set after key in the composite aggregation [_transform]

Hi,

We would like to set up a transform job that produces statistics on how many devices are connected per hour. In our code, we add deviceId and clientId to the labels of every request [labels.deviceId, labels.clientId].

We have created index template:

PUT _index_template/test_apm_devices_visits_per_hour
{
  "template": {
    "settings": {
      "number_of_shards": 10,
      "number_of_replicas": 1
    },
    "mappings": {
      "properties": {
        "labels.deviceId": {
          "type": "keyword"
        },
        "labels.clientId": {
          "type": "keyword"
        },
        "@timestamp": {
          "type": "date"
        },
        "visits": {
          "type": "long"
        }
      }
    }
  },
  "index_patterns": [
    "test_apm_devices_visits_per_hour-*"
  ]
}

We have created the transform:

PUT _transform/test_devices-visits-per-hour
{
  "source": {
    "index": "apm-*",
    "query": {
      "bool": {
        "must": [
          {
            "range": {
              "@timestamp": {
                "gte": "now-1d/d"
              }
            }
          },
          {
            "wildcard": {
              "service.name": {
                "value": "prefix-for-service-name*"
              }
            }
          }
        ]
      }
    }
  },
  "dest": {
    "index": "test_apm_devices_visits_per_hour"
  },
  "pivot": {
    "group_by": {
      "labels.deviceId": {
        "terms": {
          "field": "labels.deviceId"
        }
      },
      "labels.clientId": {
        "terms": {
          "field": "labels.clientId"
        }
      },
      "@timestamp": {
        "date_histogram": {
          "field": "@timestamp",
          "calendar_interval": "1h"
        }
      },
      "service.name": {
        "terms": {
          "field": "service.name"
        }
      }
    },
    "aggregations": {
      "visits": {
        "value_count": {
          "field": "labels.deviceId"
        }
      }
    }
  },
  "frequency": "30m",
  "sync": {
    "time": {
      "field": "@timestamp",
      "delay": "1m"
    }
  },
  "settings": {
    "max_page_search_size": 10000
  }
}

We created the index from the template:

PUT test_apm_devices_visits_per_hour

We started the transform:

POST _transform/test_devices-visits-per-hour/_start

Immediately after starting it, we received an exception:

task encountered irrecoverable failure: ElasticsearchParseException[Cannot set after key in the composite aggregation [_transform] - incompatible value in the position 0: invalid value, expected string, got Double]; nested: IllegalArgumentException[incompatible value in the position 0: invalid value, expected string, got Double]; nested: IllegalArgumentException[invalid value, expected string, got Double];; java.lang.IllegalArgumentException: incompatible value in the position 0: invalid value, expected string, got Double

Any help? Thanks in advance

When a pivot transform starts, it creates the destination index based on deduced mappings. The mappings are deduced from the transform config and the source index fields, e.g. sum(half_float) is mapped to a float. It does not use the template or dynamic mappings (except for scripted fields).

The transform _preview API shows the expected mappings under generated_dest_index. I would recommend trying this as the next troubleshooting step.

If you wish to avoid mapping deduction, then create the empty destination index before starting the transform.

Hope this helps.

Hi Sophie,

I have tried the transform preview. This is part of the response:

...
 {
      "visits" : 63,
      "@timestamp" : "2021-09-02T09:00:00.000Z",
      "service" : {
        "name" : "my-service-name"
      },
      "labels" : {
        "clientId" : 3.9746313E7,
        "deviceId" : 492060.0
      }
    }
  ],
  "generated_dest_index" : {
    "mappings" : {
      "_meta" : {
        "_transform" : {
          "transform" : "transform-preview",
          "version" : {
            "created" : "7.14.0"
          },
          "creation_date_in_millis" : 1630592219862
        },
        "created_by" : "transform"
      },
      "properties" : {
        "visits" : {
          "type" : "long"
        },
        "@timestamp" : {
          "type" : "date"
        },
        "service.name" : {
          "type" : "keyword"
        },
        "labels.clientId" : {
          "type" : "scaled_float"
        },
        "service" : {
          "type" : "object"
        },
        "labels.deviceId" : {
          "type" : "scaled_float"
        },
        "labels" : {
          "type" : "object"
        }
      }
    },
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "auto_expand_replicas" : "0-1"
      }
    },
    "aliases" : { }
  }
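
The deduced mappings can be compared with the template in a few lines of script. Below is a minimal Python sketch, assuming the `properties` objects from the first index template and from generated_dest_index above have been copied into dictionaries by hand. `diff_mappings` is a hypothetical helper written for illustration, not an Elasticsearch API.

```python
from typing import Dict, Tuple


def diff_mappings(
    expected: Dict[str, dict], generated: Dict[str, dict]
) -> Dict[str, Tuple[str, str]]:
    """Return {field: (expected_type, generated_type)} for fields present
    in both mappings whose types differ. Hypothetical helper."""
    mismatches = {}
    for field, spec in expected.items():
        deduced = generated.get(field)
        if deduced is not None and deduced.get("type") != spec.get("type"):
            mismatches[field] = (spec.get("type"), deduced.get("type"))
    return mismatches


# "properties" copied from the first index template in this thread.
template_props = {
    "labels.deviceId": {"type": "keyword"},
    "labels.clientId": {"type": "keyword"},
    "@timestamp": {"type": "date"},
    "visits": {"type": "long"},
}

# "properties" copied from generated_dest_index in the preview response.
preview_props = {
    "visits": {"type": "long"},
    "@timestamp": {"type": "date"},
    "service.name": {"type": "keyword"},
    "labels.clientId": {"type": "scaled_float"},
    "service": {"type": "object"},
    "labels.deviceId": {"type": "scaled_float"},
    "labels": {"type": "object"},
}

mismatches = diff_mappings(template_props, preview_props)
for field, (wanted, got) in sorted(mismatches.items()):
    print(f"{field}: template says {wanted}, transform deduced {got}")
# prints:
# labels.clientId: template says keyword, transform deduced scaled_float
# labels.deviceId: template says keyword, transform deduced scaled_float
```

Both label fields come back as scaled_float rather than keyword, which suggests they are indexed as numbers in at least some of the source indices.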

I have changed the index template mapping properties to:

PUT _index_template/apm_devices_visits_per_hour
{
  "template": {
    "settings": {
      "number_of_shards": 10,
      "number_of_replicas": 1
    },
    "mappings": {
      "properties": {
        "visits" : {
          "type" : "long"
        },
        "@timestamp" : {
          "type" : "date"
        },
        "service.name" : {
          "type" : "keyword"
        },
        "labels.clientId" : {
          "type" : "scaled_float",
          "scaling_factor": 10000000
        },
        "service" : {
          "type" : "object"
        },
        "labels.deviceId" : {
          "type" : "scaled_float",
          "scaling_factor": 10000000
        },
        "labels" : {
          "type" : "object"
        }
      }
    }
  },
  "index_patterns": [
    "apm_devices_visits_per_hour*"
  ]
}

I also created the destination index:

PUT apm_devices_visits_per_hour

After starting the transform, I got the same exception:

task encountered irrecoverable failure: ElasticsearchParseException[Cannot set after key in the composite aggregation [_transform] - incompatible value in the position 0: invalid value, expected string, got Double]; nested: IllegalArgumentException[incompatible value in the position 0: invalid value, expected string, got Double]; nested: IllegalArgumentException[invalid value, expected string, got Double];; java.lang.IllegalArgumentException: incompatible value in the position 0: invalid value, expected string, got Double

Any help? Thanks.

Hey @ciment,

Your exception might have to do with your source index mappings.

I suspect that your labels.deviceId is mapped as a number in one apm-* index and as a keyword in another.

Can you execute

GET apm-*/_field_caps?fields=labels.deviceId,labels.clientId,@timestamp,service.name

And verify that the field types are unified across your indices?
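
If the response is long, a short script can pick out the fields that are not unified. This is a minimal Python sketch, assuming the _field_caps response has been loaded into a dictionary and that the request was sent with include_unmapped=true so that indices without the field are listed too. `find_inconsistent_fields`, the sample response, and its index names are all made up for illustration.

```python
from typing import Dict, List


def find_inconsistent_fields(field_caps: dict) -> Dict[str, Dict[str, List[str]]]:
    """Return {field: {type: [indices]}} for every field that is reported
    with more than one type across the queried indices. Hypothetical helper."""
    problems = {}
    for field, by_type in field_caps.get("fields", {}).items():
        # A field with a single type entry is mapped the same way everywhere.
        if len(by_type) > 1:
            problems[field] = {
                type_name: sorted(caps.get("indices", []))
                for type_name, caps in by_type.items()
            }
    return problems


# Made-up response for illustration only; real index names and types will differ.
sample_response = {
    "indices": ["apm-example-metric", "apm-example-transaction"],
    "fields": {
        "@timestamp": {
            "date": {"type": "date", "searchable": True, "aggregatable": True}
        },
        "labels.deviceId": {
            "scaled_float": {
                "type": "scaled_float",
                "searchable": True,
                "aggregatable": True,
                "indices": ["apm-example-transaction"],
            },
            "unmapped": {
                "type": "unmapped",
                "searchable": False,
                "aggregatable": False,
                "indices": ["apm-example-metric"],
            },
        },
    },
}

for field, types in find_inconsistent_fields(sample_response).items():
    print(field, types)
```

Any field that shows up in the result has a different type, or no mapping at all, in part of the index pattern.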

Thanks!

Hi @BenTrent,

I verified the field types across my indices. I found that the fields (labels.deviceId, labels.clientId) exist only in the APM transaction and APM error indices.

I edited my transform and, in the source section, changed

...
"index": "apm-*", 
...

to

...
"index": ["apm-*-transaction*","apm-*-error*"],
...

and finally my transform works.

Thanks @BenTrent and @sophie_chang for your advice.