Query strategy for stateful event timeseries


(Ragnar Rova) #1

Hello.

I am looking for suggestions on a query strategy for the following data model.

Let's say I use ES to index state change events of a widget (alerts). A widget is identified by a unique id and has a state (green/red/manually_acked). Timestamped events describing the widget state are indexed.

A sample series of events:

[{ widget_id: 17, state: "green" }, { widget_id: 19, state: "red" }, { widget_id: 42, state: "red"}, { widget_id: 17, state: "red" }, { widget_id: 17, state: "green" }, { widget_id: 42, state: "manually_acked" } ]

Let's say I have around 1000 different widgets and they go through hundreds of state changes per day. Some widgets also report that they are in green state over and over.

In the example above, the only wanted result is widget 19, which is still red when the event stream ends.
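To make the wanted result concrete, here is a minimal Python sketch of the reduction over the sample events. It runs in memory and is not an Elasticsearch query; the function names are made up for illustration, and it assumes the list order stands in for the omitted timestamps.

```python
from typing import Dict, Iterable, List


def latest_state_per_widget(events: Iterable[dict]) -> Dict[int, str]:
    """Return the last seen state per widget.

    Assumption: events arrive in timestamp order, so a later event
    for the same widget simply overwrites the earlier one.
    """
    latest: Dict[int, str] = {}
    for event in events:
        latest[event["widget_id"]] = event["state"]
    return latest


def widgets_in_state(events: Iterable[dict], state: str = "red") -> List[int]:
    """Return the sorted ids of widgets whose latest state equals `state`."""
    latest = latest_state_per_widget(events)
    return sorted(wid for wid, current in latest.items() if current == state)


# The sample series from the post, in order of occurrence.
events = [
    {"widget_id": 17, "state": "green"},
    {"widget_id": 19, "state": "red"},
    {"widget_id": 42, "state": "red"},
    {"widget_id": 17, "state": "red"},
    {"widget_id": 17, "state": "green"},
    {"widget_id": 42, "state": "manually_acked"},
]

open_alerts = widgets_in_state(events)
print(open_alerts)  # [19]
```

Widget 17 ends green and widget 42 ends manually_acked, so only 19 remains.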

How do I efficiently perform a realtime query for all widgets whose current state is red (not green or manually acked), in a way that can handle some kind of pagination? (There are many events, which I index into date-partitioned indices.)

Events also have a timestamp field, which I omitted, and it's the latest state by timestamp for each widget that I want.

Preferably, the ES document id and the widget_id should not be the same (if making them the same hugely improves things, please say so in your answer and I can reconsider).

I tried to implement this using aggregations, but I could not paginate over the aggregation results.

Any suggestions? The real problem is an alerting system where I only want to see open alerts.

I do not want to update existing documents in-place but prefer to store multiple documents per widget state change and aggregate them on query. The query needs to be performant for showing the real-time state of widgets still in red state over many days or weeks of event data.


(Christian Dahlqvist) #2

As I described in this recent post, you might be able to achieve this by creating a separate widget-centric index and indexing your data twice. In your current index you store all the individual state changes, while in the widget-centric index you store only the current state of each widget, where each change results in an update.
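A rough Python sketch of the double-write idea, using plain in-memory structures as stand-ins for the two indices. The class and method names are hypothetical, and none of this is Elasticsearch client code. It assumes the widget-centric store is keyed by widget_id, so each change overwrites the previous entry for that widget.

```python
from typing import Dict, List


class WidgetStore:
    """In-memory stand-in for an event index plus a widget-centric index."""

    def __init__(self) -> None:
        # Append-only history: one entry per state change event.
        self.events: List[dict] = []
        # One entry per widget, keyed by widget_id (assumption).
        self.current: Dict[int, dict] = {}

    def record(self, widget_id: int, state: str, timestamp: int) -> None:
        """Write the event to both stores."""
        event = {"widget_id": widget_id, "state": state, "timestamp": timestamp}
        self.events.append(event)
        previous = self.current.get(widget_id)
        # Only overwrite when the event is not older than what is stored,
        # so an out-of-order arrival cannot roll the state back.
        if previous is None or timestamp >= previous["timestamp"]:
            self.current[widget_id] = event

    def in_state(self, state: str, page: int = 0, size: int = 10) -> List[int]:
        """Return one page of widget ids currently in `state`, sorted by id."""
        ids = sorted(w for w, doc in self.current.items() if doc["state"] == state)
        return ids[page * size:(page + 1) * size]


store = WidgetStore()
sample = [(17, "green"), (19, "red"), (42, "red"),
          (17, "red"), (17, "green"), (42, "manually_acked")]
# Timestamps are made up: the position in the sample series.
for ts, (widget_id, state) in enumerate(sample, start=1):
    store.record(widget_id, state, ts)

print(store.in_state("red"))  # [19]
```

The history keeps all six events, while the widget-centric store holds only three entries, so filtering and paging over it stays cheap no matter how many days of events accumulate.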

