ES|QL: How to count latest status per key without transform

Hello Team,

I am trying to compute metrics based only on the latest record per key using ES|QL in 9.x version?
I have a time-series index where each entity (e.g. trainnumber) emits multiple events over time and I want to:
Consider only the latest record per trainnumber (by @timestamp)
Then compute counts based on the latest station value

Example :

trainnumber | @timestamp | station
------------+--------------------------+---------
T100 | 2026-01-05T10:00:00.000Z | START
T100 | 2026-01-05T10:05:00.000Z | MID
T100 | 2026-01-05T10:10:00.000Z | END
T200 | 2026-01-05T11:00:00.000Z | START
T200 | 2026-01-05T11:07:00.000Z | MID
T300 | 2026-01-05T12:00:00.000Z | START

Latest Status :

T100 → END
T200 → MID
T300 → START

Expected Metric :

So START = 1 , MID = 1 , END = 1

I know one way is creating a transform index which will keep the latest record & on that index we can easily have the Metric for the station.

Wanted to understand if there is any other way to avoid creating transform & new index.

In the couple of years I've been active on the forum, this type of Q has come up pretty often, and it's always been a bit awkward to answer. The TOP function, which will be enhanced soon with an OutputField, has close to the functionality you need.

What's seems to be missing, and will still be missing, is support for an outputField of keyword type. If your train number was actually a number then I think TOP that might be part of 9.3 will do the job.

https://www.elastic.co/docs/reference/query-languages/esql/functions-operators/aggregation-functions#esql-top

EDIT: maybe actually better if the station were a number. With latest SNAPSHOT (*) I see:

FROM trains | STATS station=TOP(@timestamp, 1,"DESC",stationnumber) BY trainnumber | STATS train_count=COUNT(station) BY station | SORT train_count ASC

seems to do the job, given this data loaded into trains index

trainnumber,@timestamp,stationnumber
T500,2026-01-05T09:00:00.000Z,1
T100,2026-01-05T10:00:00.000Z,1
T100,2026-01-05T10:05:00.000Z,2
T400,2026-01-05T10:05:00.000Z,1
T100,2026-01-05T10:10:00.000Z,3
T200,2026-01-05T11:00:00.000Z,1
T200,2026-01-05T11:07:00.000Z,2
T300,2026-01-05T12:00:00.000Z,1
T400,2026-01-05T12:05:00.000Z,2
T400,2026-01-05T12:10:00.000Z,3
T500,2026-01-05T12:15:00.000Z,2

implcitly 1==START, 2==MID, 3==END

That all said, these types of common questions could be wayyy simpler to answer in ES|QL if the TOP function could be extended further with keyword fields as OutputField.

(*) snapshots are not released. versions!

1 Like