Equivalent to SQL window functions

Is there any way to implement SQL window functions?

Here is my use case:
Multiple events reported with "Type" and "Timestamp" fields. I'm looking for a way to extract the latest occurrence of each type.

In SQL this can be easily achieved:
This is how my data looks like:

..and this is the desired output:

The following query achieves just that:

SELECT [Type],[Description] FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY [Type] ORDER BY [Timestamp] DESC) AS [RowNumber],
[Type],
[Description]
FROM [test]
) T
WHERE T.[RowNumber] = 1

Yeah, I know Elastic is not a relational database. However, I was hoping this is feasible and I'm just unable to find out how...

1 Like

It looks like you are looking for the latest of each type. If yes, in this case do something similar to the link below and use "max" instead of "avg" on "timestamp" field.

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html

From my experience with ES, sometimes it requires a two-step process to get what you want.

Thanks for the prompt response!
I believe this approach will work. Although the best I can do is 3 queries and complex logic to build them on the fly:

  1. Fetch all the different types
  2. Construct a query to fetch max timestamps for all those types
  3. Construct a query to fetch the details (e.g. description) for given types & timestamps

Can you think of a better/shorter way?

I think 1 and 2 can be done in one query.

I should point out that pipeline aggregations offer somewhat window function like functionality. I haven't read the rest of the post because it looks like you have it covered so I'm not sure if they fit what you want. But they are a thing.

Thanks guys, it looks like I don't have a choice but to implement my own logic to fetch this.
Hoped I'll get away with just one query to avoid implementing a middle tier...

You can solve this using field collapsing with inner-hits.

{
  "collapse": {
    "field": "type",
    "inner_hits": {
      "name": "order by timestamp",
      "size": 1,
      "sort": [
        {
          "timestamp": "desc"
        }
      ]
    }
  }
}

You need to ensure that the type field is "single-valued keyword"

For more information with detailed example - https://blog.francium.tech/sql-window-function-partition-by-in-elasticsearch-c2e3941495b6
ES Documentation on Field collapsing - https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-collapse.html

1 Like