How to detect status transition in ingested log data

I would like to ask for high-level advice on how to approach the following problem (we run on-premise Elastic 8.8.0).

Every 30 seconds, the following data about the status of a resource is ingested into Elastic via Elastic Agent HTTP endpoint > Logstash > Elasticsearch.
The measurement data looks like:

...
{"timestamp": "2023-06-30T19:29:00Z", "id": "resource-id-1", "status": "READY"}
{"timestamp": "2023-06-30T19:29:30Z", "id": "resource-id-1", "status": "READY"}
{"timestamp": "2023-06-30T19:30:00Z", "id": "resource-id-1", "status": "READY"}
{"timestamp": "2023-06-30T19:30:30Z", "id": "resource-id-1", "status": "READY"}
{"timestamp": "2023-06-30T19:31:00Z", "id": "resource-id-1", "status": "READY"}
{"timestamp": "2023-06-30T19:31:30Z", "id": "resource-id-1", "status": "STEADY"}
{"timestamp": "2023-06-30T19:32:00Z", "id": "resource-id-1", "status": "STEADY"}
{"timestamp": "2023-06-30T19:32:30Z", "id": "resource-id-1", "status": "STEADY"}
{"timestamp": "2023-06-30T19:33:00Z", "id": "resource-id-1", "status": "STEADY"}
{"timestamp": "2023-06-30T19:33:30Z", "id": "resource-id-1", "status": "STEADY"}
{"timestamp": "2023-06-30T19:34:00Z", "id": "resource-id-1", "status": "GO"}
{"timestamp": "2023-06-30T19:34:30Z", "id": "resource-id-1", "status": "GO"}
{"timestamp": "2023-06-30T19:35:00Z", "id": "resource-id-1", "status": "GO"}
...

The resource is identified by its id (there are a handful of resources with different ids, which are not shown here for clarity).
My end goal is to compute the duration of the transition of every resource from READY to GO.
That is, for a particular resource, to compute the difference of timestamps between:

  • the first STEADY after READY
  • the first GO after STEADY

In this example, it would be 2023-06-30T19:34:00Z - 2023-06-30T19:31:30Z = 150 seconds
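
To make the rule above concrete, here is a minimal Python sketch of the intended computation, independent of Elasticsearch. The names `parse_ts` and `ready_to_go_durations` are hypothetical, and the sketch assumes the records are already available as a list of dicts shaped like the sample data.

```python
from datetime import datetime

def parse_ts(ts):
    # Replace the trailing "Z" so older Python versions can parse the value.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def ready_to_go_durations(records):
    """Return (id, seconds) for each completed READY -> STEADY -> GO cycle."""
    prev_status = {}   # last status seen per resource id
    steady_start = {}  # timestamp of the first STEADY after READY, per id
    durations = []
    ordered = sorted(records, key=lambda r: (r["id"], parse_ts(r["timestamp"])))
    for rec in ordered:
        rid, status, ts = rec["id"], rec["status"], parse_ts(rec["timestamp"])
        prev = prev_status.get(rid)
        if status == "STEADY" and prev == "READY":
            steady_start[rid] = ts  # first STEADY after READY
        elif status == "GO" and prev == "STEADY" and rid in steady_start:
            # First GO after STEADY closes the cycle.
            durations.append((rid, (ts - steady_start.pop(rid)).total_seconds()))
        elif status == "READY":
            steady_start.pop(rid, None)  # cycle restarted or abandoned
        prev_status[rid] = status
    return durations

sample = [
    {"timestamp": "2023-06-30T19:31:00Z", "id": "resource-id-1", "status": "READY"},
    {"timestamp": "2023-06-30T19:31:30Z", "id": "resource-id-1", "status": "STEADY"},
    {"timestamp": "2023-06-30T19:33:30Z", "id": "resource-id-1", "status": "STEADY"},
    {"timestamp": "2023-06-30T19:34:00Z", "id": "resource-id-1", "status": "GO"},
    {"timestamp": "2023-06-30T19:34:30Z", "id": "resource-id-1", "status": "GO"},
]
print(ready_to_go_durations(sample))  # [('resource-id-1', 150.0)]
```

Only the status changes matter here: repeated STEADY or GO records do not move the boundaries, so the result stays at 150 seconds regardless of how long the resource remains in GO.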

Obviously, a resource can transition at any timestamp, and the transition can last anywhere from 30 seconds to 3600 seconds.
Once the resource is in GO status, it can transition back to READY (but I am not interested in measuring this duration).

What would be the most natural "Elastic" way of tackling this situation?
I was considering the following:

  • Elastic transform (Transforming data | Elasticsearch Guide [8.8] | Elastic). However, I am having a hard time figuring out how to filter for the two events that mark the boundaries of the transition. Once I have those two events, computing the duration is easy with the transform's grouping and aggregation.
  • Elastic ingest pipeline, perhaps with an enrich processor.
  • Elastic ML job (just an idea, I have not looked into this).
  • An external job scheduled by cron, for example, to post-process the data.
  • Something else?

Please let me know your thoughts.

Here is another variation on the same input data.
Sometimes only the READY -> STEADY transition is observed, and the final GO is not seen for more than 1 hour. In that case, I need to assume that the final state is actually ERROR.
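
As a minimal sketch of this timeout rule (not tied to any Elastic feature), the Python below classifies the records of a single resource. The names `classify_transition`, `GO_TIMEOUT`, and the `PENDING` and `NO_TRANSITION` labels are hypothetical. It also assumes that a GO arriving later than one hour after the first STEADY still counts as ERROR, which the description above does not spell out.

```python
from datetime import datetime, timedelta

GO_TIMEOUT = timedelta(hours=1)  # no GO for more than 1 hour means ERROR

def parse_ts(ts):
    # Replace the trailing "Z" so older Python versions can parse the value.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def classify_transition(records, now):
    """Classify one resource's records; `now` is the evaluation time."""
    steady_at = None  # timestamp of the first STEADY after READY
    prev = None
    for rec in sorted(records, key=lambda r: parse_ts(r["timestamp"])):
        ts, status = parse_ts(rec["timestamp"]), rec["status"]
        if steady_at is None:
            if status == "STEADY" and prev == "READY":
                steady_at = ts
        elif status == "GO":
            waited = ts - steady_at
            # Assumption: a GO that arrives after the timeout is still an ERROR.
            if waited <= GO_TIMEOUT:
                return ("GO", waited.total_seconds())
            return ("ERROR", None)
        prev = status
    if steady_at is None:
        return ("NO_TRANSITION", None)
    # STEADY was seen but no GO yet: ERROR only once the timeout has passed.
    if now - steady_at > GO_TIMEOUT:
        return ("ERROR", None)
    return ("PENDING", None)

stuck = [
    {"timestamp": "2023-06-30T19:31:00Z", "id": "resource-id-1", "status": "READY"},
    {"timestamp": "2023-06-30T19:31:30Z", "id": "resource-id-1", "status": "STEADY"},
    {"timestamp": "2023-06-30T19:32:00Z", "id": "resource-id-1", "status": "STEADY"},
]
print(classify_transition(stuck, parse_ts("2023-06-30T19:40:00Z")))  # ('PENDING', None)
print(classify_transition(stuck, parse_ts("2023-06-30T20:31:31Z")))  # ('ERROR', None)
```

The `now` argument is passed in explicitly so that the same records can be re-evaluated later, for example by a scheduled job, once the one-hour window has elapsed.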

I indexed your docs into an index named sunman:

POST sunman/_bulk
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:29:00Z", "id": "resource-id-1", "status": "READY" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:29:30Z", "id": "resource-id-1", "status": "READY" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:30:00Z", "id": "resource-id-1", "status": "READY" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:30:30Z", "id": "resource-id-1", "status": "READY" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:31:00Z", "id": "resource-id-1", "status": "READY" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:31:30Z", "id": "resource-id-1", "status": "STEADY" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:32:00Z", "id": "resource-id-1", "status": "STEADY" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:32:30Z", "id": "resource-id-1", "status": "STEADY" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:33:00Z", "id": "resource-id-1", "status": "STEADY" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:33:30Z", "id": "resource-id-1", "status": "STEADY" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:34:00Z", "id": "resource-id-1", "status": "GO" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:34:30Z", "id": "resource-id-1", "status": "GO" }
{ "index" : { "_index" : "sunman" } }
{ "timestamp": "2023-06-30T19:35:00Z", "id": "resource-id-1", "status": "GO" }

and ran the following query, which seems to return accurate results. You be the judge.

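GET /sunman/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "id.keyword": "resource-id-1"
          }
        },
        {
          "terms": {
            "status.keyword": ["STEADY", "GO"]
          }
        }
      ]
    }
  },
  "aggs": {
    "duration": {
      "scripted_metric": {
        "init_script": "state.steady = Long.MAX_VALUE; state.go = Long.MAX_VALUE;",
        "map_script": """
          // Track the earliest STEADY and the earliest GO separately.
          // Taking max - min over all STEADY and GO docs would also count the time spent in GO.
          long t = doc['timestamp'].value.toInstant().toEpochMilli();
          if (doc['status.keyword'].value == 'STEADY') {
            if (t < state.steady) { state.steady = t; }
          } else if (doc['status.keyword'].value == 'GO') {
            if (t < state.go) { state.go = t; }
          }
        """,
        "combine_script": "return ['steady': state.steady, 'go': state.go]",
        "reduce_script": """
          def steady = Long.MAX_VALUE;
          def go = Long.MAX_VALUE;
          for (s in states) {
            if (s == null) { continue; }
            if (s.steady < steady) { steady = s.steady; }
            if (s.go < go) { go = s.go; }
          }
          // No STEADY or no GO in the queried range: nothing to measure.
          if (steady == Long.MAX_VALUE || go == Long.MAX_VALUE) { return null; }
          // Seconds from the first STEADY to the first GO.
          // Assumes a single READY -> STEADY -> GO cycle in the queried range.
          return (go - steady) / 1000.0;
        """
      }
    }
  }
}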