Searching aggregation to calculate the status of running processes

sasa0103 · February 28, 2018, 12:42pm

The simplified scenario is like this:

There are several processes that each write a new entry to Elasticsearch every few minutes. Each entry contains a status flag that shows whether the process is finished or still running. Once a process is finished, no further entries are written. Every process has a unique ID.

Now I need a visualization in Kibana that shows the number of running processes. The data could look like this:

"_source": {
  "processId": "1",
  "status": "finished"
}

In this example there are four processes:

Process with three entries:
- open
- open
- finished
Process with one entry:
- open
Process with two entries:
- open
- open
Process with one entry:
- finished

The number of finished processes is no problem, because "finished" is only written once per process. I can aggregate on processId with a unique count and filter for entries where status is finished. But how can I get the correct number of open processes?

The correct numbers in this example would be:

open: 2
finished: 2
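
As a sanity check on these numbers, here is a minimal Python sketch of the intended counting rule: take the last entry of each process and count processes by that status. The `entries` list and the process IDs "1" to "4" are made up to mirror the example above, and list order stands in for write order, since the sample documents carry no timestamp.

```python
from collections import Counter

# Hypothetical entries mirroring the example, in the order they were written.
entries = [
    ("1", "open"), ("1", "open"), ("1", "finished"),
    ("2", "open"),
    ("3", "open"), ("3", "open"),
    ("4", "finished"),
]

def count_by_latest_status(entries):
    """Count processes by the status of their most recent entry."""
    latest = {}
    for process_id, status in entries:
        latest[process_id] = status  # later entries overwrite earlier ones
    return Counter(latest.values())

counts = count_by_latest_status(entries)
print(dict(counts))
```

Counting raw "open" entries instead would give 5 here, which is why a plain filter on status overstates the number of open processes.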

jbudz · March 2, 2018, 1:13am

Is it possible to add a timestamp to your documents? If you do, you can use the top hits metric sorted on timestamp to pull the latest value of a bucket. The buckets would be a terms aggregation on process id.
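
A rough sketch of what that could look like as a search request body, written as a Python dict. The field names `processId.keyword`, `timestamp`, and `status` are assumptions about the mapping, and the aggregation names `by_process` and `latest_entry` are arbitrary.

```python
import json

def latest_status_request(max_processes=10):
    """Build a search body: one bucket per process, holding its newest document."""
    return {
        "size": 0,  # only aggregations are needed, no search hits
        "aggs": {
            "by_process": {
                # one bucket per process ID (assumed keyword sub-field)
                "terms": {"field": "processId.keyword", "size": max_processes},
                "aggs": {
                    "latest_entry": {
                        "top_hits": {
                            "size": 1,  # keep only the newest document
                            "sort": [{"timestamp": {"order": "desc"}}],
                            "_source": {"includes": ["status"]},
                        }
                    }
                },
            }
        },
    }

body = latest_status_request()
print(json.dumps(body, indent=2))
```

Each bucket in the response would then carry the latest status of one process. Note that this produces a list of statuses per process, not a count per status.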

shemmer · March 2, 2018, 2:50pm

Hi Jon,

Thank you very much for your reply. I'll take the liberty of answering instead of @sasa0103, as we are working on the same problem. Yes, it is possible to add a timestamp to the data definition. Using the top hits metric brings us one step closer to what we are trying to achieve. There's just one crucial step missing that we can't figure out.

With your proposal we can aggregate processes by their last status, which is the initial aggregation we need. Our requirement, however, is to count the number of processes in each state: we don't want to see the last statuses in a list, but to aggregate the resulting metrics into the number of processes per status.
Do you know how this can be achieved in Kibana/Elasticsearch? I am currently experimenting with bucket selector aggregations and sum bucket aggregations in Elasticsearch, but even if I get them to work, I don't know how I could add multiple nested aggregations to a Kibana visualization such as a simple metric.

Thanks in advance for your help!

jbudz · March 7, 2018, 10:30pm

I see. I think you'll want to do this in the time series visual builder, which uses the last bucket by default for metrics.

Then, on the Options tab, set a filter for open. You can then clone this metric and add another one for finished.

shemmer · March 8, 2018, 12:22pm

Hi Jon,

Thank you again very much for your help. Unfortunately, I still don't understand how to solve my problem. A count on the time series grouped by process would still give me the wrong output. I'd get the number of messages in state open in the second-to-last bucket, just grouped by process. What I need is the total number of processes in a specific state.

In Elasticsearch I have written the following query, which does exactly what I need:

POST foo/foo/_search
{
  "size": 0,
  "aggs": {
    "processes": {
      "terms": {
        "field": "processId.keyword",
        "size": 10
      },
      "aggs": {
        "latest_date": {
          "max": {
            "field": "timestamp"
          }
        },
        "filter_open": {
          "filter": {
            "term": {
              "status": "open"
            }
          },
          "aggs": {
            "latest_open_date": {
              "max": {
                "field": "timestamp"
              }
            }
          }
        },
        "should_be_considered": {
          "bucket_selector": {
            "buckets_path": {
              "latest_process_message": "latest_date",
              "latest_open_process_message": "filter_open>latest_open_date"
            },
            "script": "params.latest_process_message == params.latest_open_process_message"
          }
        },
        "count": {
          "cardinality": {
            "field": "processId.keyword"
          }
        }
      }
    },
    "sum_of_open_cases": {
      "sum_bucket": {
        "buckets_path": "processes>count"
      }
    }
  }
}

The result of this query gives me the number of processes whose last message is not in state "finished". Is it possible to show sum_of_open_cases in a Kibana visualization?
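
To make the logic of the query easier to follow, here is a small Python sketch that mirrors it outside Elasticsearch: a process is counted when its newest timestamp overall equals its newest timestamp among "open" entries. The sample documents and their integer timestamps are invented for illustration, and the sketch only imitates the aggregation chain rather than calling Elasticsearch.

```python
# Invented sample documents: (processId, status, timestamp).
docs = [
    ("1", "open", 1), ("1", "open", 2), ("1", "finished", 3),
    ("2", "open", 1),
    ("3", "open", 1), ("3", "open", 2),
    ("4", "finished", 1),
]

def count_open_processes(docs):
    """Imitate the terms / max / filter / bucket_selector / sum_bucket chain."""
    latest = {}       # plays the role of latest_date per process
    latest_open = {}  # plays the role of filter_open>latest_open_date
    for process_id, status, ts in docs:
        latest[process_id] = max(ts, latest.get(process_id, ts))
        if status == "open":
            latest_open[process_id] = max(ts, latest_open.get(process_id, ts))
    # Keep only processes whose newest entry is an open one; each kept
    # process contributes 1, like the cardinality + sum_bucket pair.
    return sum(1 for pid, ts in latest.items() if latest_open.get(pid) == ts)

open_count = count_open_processes(docs)
print(open_count)
```

With this sample data, processes "2" and "3" are counted and `open_count` is 2.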

Thank you again for helping me!

system · April 5, 2018, 12:22pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
