Analyzing Transaction flows for orders in Kibana

Hello all,

We would like to create more insight into the processing of orders in our systems. The incoming orders go through several steps (processes). We would like to monitor all these steps using ELK. The goal is to create insights about the status of the orders: Have they been completed? If not, which step did it fail to complete?

As of yet we are not interested in troubleshooting errors. We only want to know which of the processes was the latest to "run".

  1. The order is received by the system. It has a unique orderID
  2. The order is processed in system 1
  3. The order is processed in system 2
  4. The order is sent out and marked as "processed"

For all 4 steps we managed to import the transaction data. Now we want to make the correct connection and analyze the data. We were thinking of creating a datatable such as this:

OrderID; Received; system1; system2; completed;

1234; Yes; Yes; Yes; Yes;
5678; No; Yes;

In this example, we would see that order 5678 was not completed. Currently we don't know how to create such a data table yet. Best we can do with the data table vizualisation is this:

Order ID; Filter_ServiceName;
1234; Received
1234; system1;
1234; system2;
1234; completed;
5678; Received;
5678; system1;
9102; Received;

This datatable would not be userfriendly, as it does not offer quick insight into which services are not processed (you can't sort on empty values for the "completed" process). Is there any way to create the first table?

As an alternative, it would also be good if we could create a query, where we could search for all orderID's that are present in the first step ("received"), but not present in the last step are "completed" yes. Is it possible to make such a query?

Note: every step in the process flow is written into ES as a seperate transaction. For each complete transaction, we will have 4 transactions in ES.

I would love to hear your ideas in how to solve this problem.

1 Like

Hi Martijn,

thanks for your very detailed explanation.

Also looking at your first table, I am a bit confused by the "No" in the Received for the second row. As far as I understood, a log always passes through the 4 steps in that order, so as long as system1 is already Yes, you could not get a "no" for any system before that?

If you just want to mark which systems each order passed, and you have a timestamp on the field, you could do the following with a data table:

  1. Do a bucket aggregation "Split Rows" on the "OrderID" field.
  2. Do a "Top Hit" metrics aggregation, on the Field "Filter_ServiceName", Aggregate with "Concatenate", "Size" set to 4 (since this is the max of your systems), Sort On "your timestamp field", Ascending

That way You would get a table, that lists each orderId per row, and shows all Filter_ServiceNames ordered by the "timestamp" field. So you would get the following for your above example data:

1234 | Received, system1, system2, completed
5678 | Received, system1
9102 | Received

If you don't have the time field to sort on, you could either add a "stage" field, that is 1, 2, 3 or 4 for each of the steps, so that you can sort on that, or go totally crazy and make the service names order alphabetical in the way a call passes that systems :wink:

Regarding the query: that will be a little bit harder and not doable in Kibana itself, since you would basically need to consecutive queries. First you would need to find all documents, that are in "received" state, and than check for each of those, whether or not they don't have a "completed" documents. Kibana cannot do this consecutive queries from the UI. But you could use the above created table and sort on the top hits field, which would sort depending on how many systems it passed.

Hope that explanation helped you a bit.

Cheers,
Tim

1 Like

Hi Tim,

Thank you for the suggestion. I've tested it and it works really well! We decided to add a variable that indicates the stage, and name it in such a way that it can be sorted alphabetically. For example: GetOrders -> 1_GetOrders

Since the "Top Hit" aggregation sorts the values inside the column on their timestamp, we will get the following string:

1234 | 4_finalized, 3_system2, 2_system1, 1_GetOrders

and for incomplete transactions:

5678 | 2_system1, 1_GetOrders

The nice thing is that we can sort the data table alphabetically on the "Top Hits" column. This way, the incomplete transactions will be displayed on the top of the data table:

OrderID | Top Hits
5678    | 2_system1, 1_GetOrders
1234    | 4_finalized, 3_system2, 2_system1, 1_GetOrders;

EDIT: Just realized that the data table already sorts on the amount of hits, so alphabetically sorting isn't even needed... :slight_smile:

Regards,

Martijn

2 Likes

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.