We have some processes that each write a new entry to Elasticsearch every few minutes. Each entry contains a status flag that shows whether the process is finished or still running. Once a process is finished, no further entries are written. Every process has a unique id.
Now I need a visualization in Kibana that shows the number of running processes. The data could look like this:
The number of finished processes is no problem, because "finished" is written only once per process. I can aggregate on processId with a unique count and add a filter that keeps only the entries where status is finished. But how can I get the correct number of open processes?
Is it possible to add a timestamp to your documents? If you do, you can use the top hits metric sorted on timestamp to pull the latest value of a bucket. The buckets would be a terms aggregation on process id.
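To make that suggestion concrete, here is a rough sketch of the kind of request body it corresponds to, written as a Python dict so it can be printed or passed to a client. The field names `processId`, `status` and `@timestamp` are assumptions (as is `processId` being a keyword field), and the bucket size of 1000 is arbitrary, so adjust them to your mapping.

```python
import json

# Hypothetical field names; replace them with the ones in your index mapping.
PROCESS_FIELD = "processId"
TIME_FIELD = "@timestamp"
STATUS_FIELD = "status"

# One bucket per process id; inside each bucket, keep only the newest document.
query = {
    "size": 0,  # we only care about the aggregation, not the search hits
    "aggs": {
        "by_process": {
            "terms": {"field": PROCESS_FIELD, "size": 1000},
            "aggs": {
                "latest_entry": {
                    "top_hits": {
                        "size": 1,
                        "sort": [{TIME_FIELD: {"order": "desc"}}],
                        "_source": {"includes": [STATUS_FIELD]},
                    }
                }
            },
        }
    },
}

print(json.dumps(query, indent=2))
```

Each `by_process` bucket then carries the latest status of one process. Note that this gives a list of processes with their last status, not yet a count per status.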
Thank you very much for your reply. I'll take the liberty of answering instead of @sasa0103, as we are working on the same problem. Yes, it is possible to add a timestamp to the data definition. The top hits metric brings us one step closer to what we are trying to achieve. There's just one crucial step missing that we can't figure out.
With your proposal we can group processes by their last status, which is the initial aggregation we need. Our requirement, however, is to count the number of processes in each state, i.e. not to list the last status of every process but to aggregate those results into the number of processes per status.
Do you know how this can be achieved in Kibana/Elasticsearch? I am currently experimenting with bucket selector aggregations and sum buckets in Elasticsearch, but even if I get it to work, I don't know how I could add multiple nested aggregations to a Kibana visualization like a simple metric.
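To spell out the result we are after, here is a small client-side sketch in plain Python. It is not an Elasticsearch query, only an illustration of the two steps (take the latest entry per process, then count processes per status). The sample documents and the field names `processId`, `status` and `timestamp` are made up for the example.

```python
from collections import Counter

def count_by_latest_status(docs):
    """Count processes per status, looking only at each process's newest entry."""
    latest = {}
    for doc in docs:
        pid = doc["processId"]
        # Keep the entry with the highest timestamp for every process id.
        if pid not in latest or doc["timestamp"] > latest[pid]["timestamp"]:
            latest[pid] = doc
    return Counter(doc["status"] for doc in latest.values())

# Made-up sample data: p1 has finished, p2 and p3 are still running.
sample_docs = [
    {"processId": "p1", "status": "running", "timestamp": 1},
    {"processId": "p1", "status": "running", "timestamp": 2},
    {"processId": "p1", "status": "finished", "timestamp": 3},
    {"processId": "p2", "status": "running", "timestamp": 1},
    {"processId": "p2", "status": "running", "timestamp": 2},
    {"processId": "p3", "status": "running", "timestamp": 2},
]

counts = count_by_latest_status(sample_docs)
print(dict(counts))
```

A plain count of documents with status "running" would give 5 here, while the number of running processes is 2, which is exactly the difference we are struggling with.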
Thank you again very much for your help. Unfortunately, I still don't understand how to solve my problem. A count on the time series grouped by process would still give me the wrong output: I would get the number of messages in state open in the second-to-last bucket, just grouped by process. What I need is the total number of processes in a specific state.
In Elasticsearch I have written the following query, which does exactly what I need:
This query returns the number of processes whose last message is not in state "finished". Is it possible to show sum_of_open_cases in a Kibana visualization?