Looking for an aggregation to calculate the status of running processes


#1

The simplified scenario is like this:

There are some processes that each write a new entry to Elasticsearch every few minutes. Each entry contains a status flag that shows whether the process is finished or still running. Once a process is finished, no further entries are written. Every process has a unique id.

Now I need a visualization in Kibana that shows the number of running processes. The data could look like this:

"_source": {
  "processId": "1",
  "status": "finished"
} 

There are four processes:

  1. Process with three entries:

    • open
    • open
    • finished
  2. Process with one entry:

    • open
  3. Process with two entries:

    • open
    • open
  4. Process with one entry:

    • finished

The number of finished processes is no problem, because "finished" is only written once per process. I can aggregate on processId with a unique count and filter for the entries where status is finished. But how can I get the correct number of open processes?

The correct numbers in this example would be:

open: 2
finished: 2
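
To make the expected result concrete, here is a minimal Python sketch of the calculation outside of Elasticsearch. It assumes that the order of the entries stands in for a timestamp (the sample data has none), and the function name `count_by_latest_status` is only illustrative.

```python
from collections import Counter

# Entries from the example above, in the order they were written.
# Assumption: list order stands in for a timestamp.
entries = [
    ("1", "open"), ("1", "open"), ("1", "finished"),
    ("2", "open"),
    ("3", "open"), ("3", "open"),
    ("4", "finished"),
]

def count_by_latest_status(entries):
    """Count processes by the status of their most recent entry."""
    latest = {}
    for process_id, status in entries:
        # Later entries overwrite earlier ones, so only the last status survives.
        latest[process_id] = status
    return Counter(latest.values())

counts = count_by_latest_status(entries)
print("open:", counts["open"])
print("finished:", counts["finished"])
```

The point is that the count has to be taken over the last entry of each process, not over all entries with status open.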


(Jon Budzenski) #2

Is it possible to add a timestamp to your documents? If you do, you can use the top hits metric sorted on timestamp to pull the latest value of a bucket. The buckets would be a terms aggregation on process id.
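
As a rough sketch of that idea, the request body could be built like this in Python. The field names `processId.keyword` and `timestamp`, the aggregation names, and the helper `latest_status_query` are assumptions for illustration and need to match your mapping.

```python
import json

def latest_status_query(id_field="processId.keyword",
                        time_field="timestamp",
                        max_processes=1000):
    """Build a search body: one bucket per process, each holding its newest document.

    All field names and max_processes are assumed values, not taken from a real index.
    """
    return {
        "size": 0,  # no search hits needed, only the aggregation result
        "aggs": {
            "processes": {
                # One bucket per process id.
                "terms": {"field": id_field, "size": max_processes},
                "aggs": {
                    "latest": {
                        # Newest document of the bucket, reduced to its status.
                        "top_hits": {
                            "size": 1,
                            "sort": [{time_field: {"order": "desc"}}],
                            "_source": {"includes": ["status"]},
                        }
                    }
                },
            }
        },
    }

body = latest_status_query()
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a `_search` request. It returns the latest status per process, but it does not count processes per status by itself.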


(Stefan Hemmer) #3

Hi Jon,

thank you very much for your reply. I'll take the liberty of answering instead of @sasa0103, as we are working on the same problem. Yes, it is possible to add a timestamp to the data definition. The top hits metric brings us one step closer to what we are trying to achieve. There's just one crucial step missing that we can't figure out.

Using your proposal we can group processes by their last status, which is the initial aggregation we need. However, our requirement is to count the number of processes in each state, i.e. not to see the last statuses in a list but to aggregate the resulting metrics into the number of processes per status.
Do you know how this can be achieved in Kibana/Elasticsearch? I am currently experimenting with bucket selector and sum bucket aggregations in Elasticsearch, but even if I get it to work, I don't know how I could add multiple nested aggregations to a Kibana visualization such as a simple metric.

Thanks in advance for your help!


(Jon Budzenski) #4

I see. I think you'll want to do this in the time series visual builder, which uses the last bucket by default for metrics.


Then, on the Options tab, set a filter for open. You can then clone this metric and add another one for finished.


(Stefan Hemmer) #5

Hi Jon,

thank you again very much for your help. Unfortunately I still don't understand how to solve my problem. A count on the time series grouped by process would still give me the wrong output. I'll get the number of messages in the second to last bucket in the state open, just grouped by processes. What I need is the total number of processes in a specific state.

In Elasticsearch I have written the following query, which does exactly what I need:

POST foo/foo/_search
{
  "size": 0,
  "aggs": {
    "processes": {
      "terms": {
        "field": "processId.keyword",
        "size": 10
      },
      "aggs": {
        "latest_date": {
          "max": {
            "field": "timestamp"
          }
        },
        "filter_open": {
          "filter": {
            "term": {
              "status": "open"
            }
          },
          "aggs": {
            "latest_open_date": {
              "max": {
                "field": "timestamp"
              }
            }
          }
        },
        "should_be_considered": {
          "bucket_selector": {
            "buckets_path": {
              "latest_process_message": "latest_date",
              "latest_open_process_message": "filter_open>latest_open_date"
            },
            "script": "params.latest_process_message == params.latest_open_process_message"
          }
        },
        "count": {
          "cardinality": {
            "field": "processId.keyword"
          }
        }
      }
    },
    "sum_of_open_cases": {
      "sum_bucket": {
        "buckets_path": "processes>count"
      }
    }
  }
}

This query returns the number of processes whose latest message is not in the state "finished". Is it possible to show sum_of_open_cases in a Kibana visualization?
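
For readers who want to follow the logic of the query, here is a minimal Python sketch of the same comparison, run on made-up documents instead of an index. The integer timestamps and the function name `count_open_processes` are assumptions for illustration only.

```python
def count_open_processes(docs):
    """Count processes whose newest message overall is also their newest open message."""
    latest_any = {}   # per process: max timestamp over all messages ("latest_date")
    latest_open = {}  # per process: max timestamp over open messages ("latest_open_date")
    for doc in docs:
        pid, ts = doc["processId"], doc["timestamp"]
        latest_any[pid] = max(ts, latest_any.get(pid, ts))
        if doc["status"] == "open":
            latest_open[pid] = max(ts, latest_open.get(pid, ts))
    # Same test as the bucket_selector script: keep a process only if both maxima match.
    # A process without any open message has no entry in latest_open and is dropped.
    return sum(1 for pid, ts in latest_any.items() if latest_open.get(pid) == ts)

# Made-up sample data mirroring the four processes from the first post.
docs = [
    {"processId": "1", "status": "open", "timestamp": 1},
    {"processId": "1", "status": "open", "timestamp": 2},
    {"processId": "1", "status": "finished", "timestamp": 3},
    {"processId": "2", "status": "open", "timestamp": 1},
    {"processId": "3", "status": "open", "timestamp": 1},
    {"processId": "3", "status": "open", "timestamp": 2},
    {"processId": "4", "status": "finished", "timestamp": 1},
]

open_count = count_open_processes(docs)
print("open:", open_count)
```

One caveat about the query itself, as far as I understand it: the terms aggregation uses "size": 10, so only ten process buckets are returned and summed. With more than ten processes the size would probably have to be raised.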

Thank you again for helping me!

