Scripted metric not working inside transform

Hi, I have an index that contains events as documents, and I'm trying to aggregate them by sessionId. I created the following transform, and so far I have been able to generate startTime, endTime, and sessionDuration, along with a terms aggregation on the event field, using the following JSON -

{
  "group_by": {
    "sessionId": {
      "terms": {
        "field": "sessionId"
      }
    }
  },
  "aggregations": {
    "events": {
      "terms": {
        "field": "event",
        "size": 10
      }
    },
    "endTime": {
      "max": {
        "field": "timestamp"
      }
    },
    "startTime": {
      "min": {
        "field": "timestamp"
      }
    },
    "sessionDuration": {
      "bucket_script": {
        "buckets_path": {
          "start": "startTime",
          "end": "endTime"
        },
        "script": "((params.end - params.start)/1000)"
      }
    }
  }
}

But this only gives me a terms aggregation of the events. I also want to know exactly which events were fired in a session, and in what order.
So I wrote the following search query, which returns all the events in the entire index as a list, which is exactly what I need -

POST test30/_search?size=0
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "eventFlow": {
      "scripted_metric": {
        "init_script": "state.allEvents = [];",
        "map_script": "state.allEvents.add(doc.event.value)",
        "combine_script": "return state.allEvents",
        "reduce_script": "List newAllEvents = new ArrayList(); for (a in states){ newAllEvents.add(a)} return newAllEvents"
      }
    }
  }
}

So I added the above scripted metric query to the transform like this -

{
  "group_by": {
    "sessionId": {
      "terms": {
        "field": "sessionId"
      }
    }
  },
  "aggregations": {
    "events": {
      "terms": {
        "field": "event",
        "size": 10
      }
    },
    "endTime": {
      "max": {
        "field": "timestamp"
      }
    },
    "startTime": {
      "min": {
        "field": "timestamp"
      }
    },
    "sessionDuration": {
      "bucket_script": {
        "buckets_path": {
          "start": "startTime",
          "end": "endTime"
        },
        "script": "((params.end - params.start)/1000)"
      }
    },
    "eventFlow": {
      "scripted_metric": {
        "init_script": "state.allEvents = [];",
        "map_script": "state.allEvents.add(doc.event.value)",
        "combine_script": "return state.allEvents",
        "reduce_script": "List newAllEvents = new ArrayList(); for (a in states){ newAllEvents.add(a)} return newAllEvents"
      }
    }
  }
}

But I don't see any new field created in the preview section of the transform. Am I missing something here?

Can you post some example documents so I can reproduce the issue? I don't see anything wrong; adding one more aggregation should not break _preview.

Regarding your script:

The for loop iterates over the shard results, which are lists. By adding each list to a list, you get lists nested inside your newAllEvents list. I guess your result contains:

    "eventFlow": {
      "value": [
        [

Once you have several indices or shards, the result will have several lists. I think you should use newAllEvents.addAll(a) in your script, to avoid the extra [.
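For illustration, here is a sketch of the eventFlow aggregation with that one change applied. It assumes the rest of your scripts stay exactly as you posted them; only the reduce_script differs, using addAll so that each per-shard list is flattened into newAllEvents. I have wrapped it in braces so it is valid JSON on its own; in the transform, the "eventFlow" entry would sit under "aggregations" as before.

```json
{
  "eventFlow": {
    "scripted_metric": {
      "init_script": "state.allEvents = [];",
      "map_script": "state.allEvents.add(doc.event.value)",
      "combine_script": "return state.allEvents",
      "reduce_script": "List newAllEvents = new ArrayList(); for (a in states) { newAllEvents.addAll(a); } return newAllEvents"
    }
  }
}
```

With this version, "value" should come back as a single flat list instead of a list of lists, however many shards or indices contribute.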

I've tried it with your test data. It worked for me.

Did you use the UI or the dev console? The UI does indeed have a problem showing the preview.

To run a preview in dev console:

POST _transform/_preview
{
  "source": {
    "index": "YOUR_INDEX"
  },
  "pivot": {
...