Visualization of processes

fred-d · May 11, 2016, 8:55am

I'm trying to visualize data from process or workflow executions and I encountered a few challenges:
They have a duration (start and stop time).
They can have inputs and outputs.
They can have an explicit structure (process might start subprocesses which might start subprocesses and so on).
They can have an implicit structure (one process might create an output that is used by another process).

All data is extracted from an RDBMS, except for the implicit structure, which is not yet modelled. In my first approach I just flattened every process into one document, which loses information.

Are there any examples of something like this (I couldn't find any) or do you have any suggestions?
Thanks

Joe_Fleming · May 11, 2016, 8:18pm

Nothing comes to mind personally, but perhaps knowing what it is that you're trying to visualize about the processes might lead to some useful input.

fred-d · May 12, 2016, 7:43am

I'm still trying to figure that out, since it's supposed to be generic and adaptable to clients' needs. I was just hoping that someone had done something similar so I could copy some things...

So far I have visualizations for:
Count and duration of processes and how that changes over time
Number of processed objects
Success vs failure

Things I'd like to be able to do (maybe not that important):
Show the number of running processes at any time
Abstract from subprocesses or display them on the fly

Things I need to be able to do:
Apart from viewing processes individually, I also need to organize processes that work together and see if all input gets processed.

I'm thinking of duplicating the data and saving all individual processes in one index and groups or hierarchies in another.

Joe_Fleming · May 13, 2016, 8:26pm

"Show the number of running processes at any time"

You should be able to query on the start and stop times: start is lte now and end is not set, assuming you create documents at start time and update them on completion. If you are looking not for the current jobs but for jobs in a time window, that's pretty easy too.
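As a rough sketch of both cases, the request bodies could be built like this. The field names `start_time` and `end_time` are assumptions (nothing in this thread names the actual fields), and the sketch assumes a document simply has no end field while its process is still running.

```python
def running_now_query(start_field="start_time", end_field="end_time"):
    """Processes that have started but have no end time recorded yet."""
    return {
        "query": {
            "bool": {
                # start <= now
                "filter": [{"range": {start_field: {"lte": "now"}}}],
                # end is not set
                "must_not": [{"exists": {"field": end_field}}],
            }
        }
    }


def running_in_window_query(window_start, window_end,
                            start_field="start_time", end_field="end_time"):
    """Processes whose run overlaps [window_start, window_end]."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # started before the window closed ...
                    {"range": {start_field: {"lte": window_end}}},
                    # ... and either ended after it opened or is still running
                    {
                        "bool": {
                            "should": [
                                {"range": {end_field: {"gte": window_start}}},
                                {"bool": {"must_not": [{"exists": {"field": end_field}}]}},
                            ],
                            "minimum_should_match": 1,
                        }
                    },
                ]
            }
        }
    }
```

The window version is the usual interval-overlap test: a job belongs to the window if it started before the window ended and had not finished before the window began.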

"Things I need to be able to do:
Apart from viewing processes individually, I also need to organize processes that work together and see if all input gets processed.

I'm thinking of duplicating the data and saving all individual processes in one index and groups or hierarchies in another."

That's the only way I can think to do it; just index each process, perhaps in a new index, and record the input/output and pass/fail, regardless of where it came from. I'm not sure how you'd keep that relational association though, of knowing if a process is a child, or knowing if a process has children.
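One possible way to keep the association, sketched here purely as an assumption, is to store a parent reference plus the input and output names on every flat process document and derive the structure afterwards. The field names (`id`, `parent_id`, `inputs`, `outputs`, `status`) and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical flat process documents, one per process or subprocess.
processes = [
    {"id": "p1", "parent_id": None, "inputs": ["a.csv"], "outputs": ["b.csv"], "status": "success"},
    {"id": "p2", "parent_id": "p1", "inputs": ["a.csv"], "outputs": ["tmp.bin"], "status": "success"},
    {"id": "p3", "parent_id": None, "inputs": ["b.csv"], "outputs": [], "status": "failure"},
]


def children_by_parent(docs):
    """Explicit structure: map each parent id to the ids of its subprocesses."""
    tree = defaultdict(list)
    for doc in docs:
        if doc["parent_id"] is not None:
            tree[doc["parent_id"]].append(doc["id"])
    return dict(tree)


def implicit_links(docs):
    """Implicit structure: (producer, consumer) pairs linked by a shared item."""
    producers = defaultdict(list)
    for doc in docs:
        for item in doc["outputs"]:
            producers[item].append(doc["id"])
    links = []
    for doc in docs:
        for item in doc["inputs"]:
            for producer in producers.get(item, []):
                if producer != doc["id"]:
                    links.append((producer, doc["id"]))
    return links


def unconsumed_outputs(docs):
    """Outputs that no process in the set takes as input."""
    consumed = {item for doc in docs for item in doc["inputs"]}
    produced = {item for doc in docs for item in doc["outputs"]}
    return sorted(produced - consumed)
```

The derived parent/child map and producer/consumer links could then be written to the second index as group documents. `unconsumed_outputs` is only one reading of "see if all input gets processed"; the real check depends on how inputs are identified in the source data.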

fred-d · May 17, 2016, 7:18am

I can do that for currently running jobs, but how would that work in a date histogram?

Joe_Fleming · May 17, 2016, 4:47pm

If you're just looking for the number of jobs that ran during a given interval, which is how I understand your question (correct me if I'm wrong), all you need is the count: just do the aggregation on the start time. It may not be 100% accurate (long-running jobs that span multiple time chunks/buckets will only show up in one bucket), but it should be close enough to give you the insight you want. Combine that with a chart showing the run times and you should get some decent insight into the system.
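A minimal sketch of such a request body, assuming hypothetical fields `start_time` and `duration_ms`. Each bucket's `doc_count` is the number of jobs that started in that interval, and the sub-aggregation adds the longest run time per bucket. Recent Elasticsearch releases use `fixed_interval` (or `calendar_interval`) in the date histogram; older releases used a plain `interval` key instead, so adjust that to your version.

```python
def jobs_per_interval_body(interval="1h",
                           start_field="start_time",
                           duration_field="duration_ms"):
    """Count jobs by start time and report the longest run time per bucket."""
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "aggs": {
            "jobs_over_time": {
                "date_histogram": {
                    "field": start_field,
                    # older Elasticsearch releases expect "interval" here
                    "fixed_interval": interval,
                },
                "aggs": {
                    "max_runtime": {"max": {"field": duration_field}},
                },
            }
        },
    }
```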

Raggyman · May 18, 2016, 4:19am

My monitoring system used to do the following: it would split up a transaction and display the breakdown of that transaction. It was extremely useful, and it would be great to have a similar tool.

I have always hated average times, because they don't show you the actual, entire response time of the system. Kibana won't be able to do this, as the sheer number of points will make the browser unusable. It is extremely useful, though: in this case, it shows the impact of backups on the system at 10pm.

fred-d · May 18, 2016, 6:51am

Actually, I'm interested in the number of jobs executing in parallel, not the average over an interval. The use case I have in mind is the following: if there are parallel jobs and this coincides with longer runtimes, then these jobs are probably competing for resources. In that case the jobs could be scheduled differently or on different systems. I think the best way to find that connection is to have two date histograms in a dashboard, one showing max runtime and one showing max parallelism.
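To make "max parallelism" concrete, here is a small sketch that computes it outside Elasticsearch from (start, end) pairs with a sweep over start and end events. It is not a Kibana feature; the sample jobs are made up, and a still-running job would need a stand-in end time (for example, now) before being passed in.

```python
from datetime import datetime, timedelta


def max_parallel(intervals):
    """Peak number of intervals that are open at the same instant."""
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    # At equal timestamps, process ends (-1) before starts (+1) so that
    # back-to-back jobs are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


def max_parallel_per_bucket(intervals, first_bucket, bucket_size, bucket_count):
    """Peak parallelism inside each bucket, counting jobs that span buckets."""
    result = []
    for i in range(bucket_count):
        lo = first_bucket + i * bucket_size
        hi = lo + bucket_size
        # Keep every job that overlaps the bucket and clip it to the bucket.
        clipped = [(max(s, lo), min(e, hi)) for s, e in intervals if s < hi and e > lo]
        result.append((lo, max_parallel(clipped)))
    return result


# Made-up sample: job B starts in the first hour and runs into the second.
day = datetime(2016, 5, 18)
jobs = [
    (day.replace(hour=10, minute=0), day.replace(hour=10, minute=40)),   # A
    (day.replace(hour=10, minute=30), day.replace(hour=11, minute=10)),  # B
    (day.replace(hour=11, minute=5), day.replace(hour=11, minute=20)),   # C
]
peaks = max_parallel_per_bucket(jobs, day.replace(hour=10), timedelta(hours=1), 2)
print([count for _, count in peaks])  # prints [2, 2]
```

A histogram on start time alone would report 2 jobs in the first hour and 1 in the second, because job B is only counted where it starts; the per-bucket sweep shows that two jobs overlap in both hours.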

fred-d · May 18, 2016, 7:00am

I usually use max or percentile metrics. Maybe point diagrams can be useful, but I personally consider line charts more readable.
