How to calculate the elapsed time of a flag status per day?

I'd like to figure out the elapsed time between flag status changes.

Simplified example: a person can only be sad or happy. I'd like to know how long each mood was active until it changed.

I'm looking for a general best practice. I'm open to changing the way I create my logs. I'm new to data science and am probably missing the right terms, so any related advice is very much appreciated.

Index: user_mood_log

 | changed_at       | mood  | user |
 |------------------|-------|------|
 | yyyy-mm-dd 12:00 | sad   | John |
 | yyyy-mm-dd 15:00 | happy | John |
 | yyyy-mm-dd 18:00 | happy | John |
 | ...              | ...   | ...  |

Challenges:

  1. The last record of yesterday (or even older) needs to be considered, which means I can't simply group by day.
  2. A mood might not change for several days.
  3. Duplicated mood logs are possible.
  4. I may want to change the grouping to hours, months, or other intervals on the fly.
  5. There could be more than 2 moods.
  6. The number of logs is fairly large.

Any advice? Thanks in advance!
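
To illustrate the computation I'm after, here is a rough pandas sketch of what I mean by "elapsed time per mood" (the dates are invented and the column names are the ones from the table above; this is only meant to show the intent, not a solution I've settled on):

```python
import pandas as pd

# Sample of the raw change log, same shape as the table above (dates invented).
log = pd.DataFrame({
    "changed_at": pd.to_datetime(["2021-06-01 12:00",
                                  "2021-06-01 15:00",
                                  "2021-06-01 18:00"]),
    "mood": ["sad", "happy", "happy"],
    "user": ["John", "John", "John"],
})

log = log.sort_values(["user", "changed_at"])

# Drop consecutive duplicates (challenge 3): keep a row only when the mood
# actually changed for that user.
log = log[log["mood"].ne(log.groupby("user")["mood"].shift())].copy()

# Each state lasts until the same user's next change. Splitting these
# durations across day boundaries (challenges 1 and 2) is the part I don't
# know how to do on the Elasticsearch side.
log["ends_at"] = log.groupby("user")["changed_at"].shift(-1)
log["duration"] = log["ends_at"] - log["changed_at"]
print(log)
```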

I've just found this: Transactions | Kibana Guide [7.13] | Elastic

which shows exactly the kind of graph I need to build, but based on Filebeat data.

"transaction" id would be John and I would see time passed in each status.

How can I do that?

A transaction is part of APM (application performance monitoring), where the transaction gets created before it is indexed.

However, in your case, you might be able to keep your data structure and index events as they come in, but then use transforms to move the data into a format suitable for querying further.
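
For instance, the destination index could hold one document per user, day, and mood with the accumulated time in that mood. This is only an assumed target shape to illustrate what "a format suitable for further querying" could look like, not something an existing transform produces out of the box:

```python
# Assumed example only: a raw change event as indexed today, and the kind of
# entity-centric summary document a transform could write to a destination
# index for straightforward aggregations in Kibana.
raw_event = {
    "changed_at": "2021-06-01T12:00:00",
    "mood": "sad",
    "user": "John",
}
summary_doc = {
    "user": "John",
    "day": "2021-06-01",
    "mood": "sad",
    "duration_seconds": 10800,  # e.g. sad from 12:00 until the 15:00 change
}
```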

Thanks for the answer. I've looked into transforms as well, but I can't find a way to calculate the elapsed time between two documents when there are no start/stop events, just a continuous flow of a wide range of events.

Any ideas on how to do that? Is it even possible?

Then it would not be a state change, but rather the duration between two events tied to the same user that needs to be calculated?

I might be a little confused on the wording here but I guess yes.

"State" for me is happy/sad/other/etc.. I need the total duration per day or per other aggregations and for each users.

Did you check this example from the docs?

Otherwise, if the start and stop are explicitly named events, you can use a scripted_metric. Another possibility is using a filter to write 2 fields and an ingest pipeline on the destination index that calculates the duration from them.
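
To spell out the start/stop case in plain terms, the calculation is just "stop timestamp minus the matching start timestamp per user". The event and field names below are made up, and this is plain Python rather than a scripted_metric, purely to illustrate the pairing logic:

```python
from datetime import datetime

# Hypothetical explicit start/stop events (names and timestamps invented).
events = [
    {"user": "John", "event": "start", "ts": datetime(2021, 6, 1, 12, 0)},
    {"user": "John", "event": "stop",  "ts": datetime(2021, 6, 1, 15, 0)},
]

open_starts = {}
for e in sorted(events, key=lambda e: e["ts"]):
    if e["event"] == "start":
        open_starts[e["user"]] = e["ts"]
    elif e["event"] == "stop" and e["user"] in open_starts:
        duration = e["ts"] - open_starts.pop(e["user"])
        print(e["user"], duration)  # John 3:00:00
```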

I've checked the example; my problem is that I don't know how to bucket only one start and one stop event. I don't want the duration between the min start (sad) and stop (happy) events of John, I want all the durations between each pair of consecutive events.

I've looked at an ingest pipeline that could add the time of the previous state to each log entry, but I think that duplicates information in a way that is less stable and less maintainable, so I'm trying to avoid it.

For now, the direction I'm looking at is something like Entity-Centric Indexing - Mark Harwood | Elastic Videos.

But it's awfully complicated for the newbie I am.

Ok, got it. In this case you probably need a script; I am not aware of an out-of-the-box function that can do this.

Before further recommending how to script it: are you working with a static dataset, which means it does not change? Or is this a data stream? (If not now, maybe in the future?)

What's the output you are looking for exactly?
Do you want 1 document per change with a duration?
Or e.g. the sum over these durations.

Maybe you can provide 1 or 2 sample documents for the output you are looking for.

This is a future data stream that is going to grow rapidly. The analytics will be more useful the closer to real time we get; I hope we can output some actionable data within a minute after events get logged.

Examples of the data needed:

  • I want to see whether the total duration per day that John is sad is growing or shrinking.
  • I want to know which states are the most common for a group of users or for a particular user.
  • I want the average duration of the state "sad", and whether it's going up or down over time.

The best widgets to display that kind of data are in APM, but I would need to heavily manipulate the data to fit them. APM is not designed for that.

That's why I hoped a continuous transform (or a continuous batching Python script) with standard visualizations would do the trick. I'm just lacking the skills, or close examples that would help me.

I'm shocked that there's no out of the box solution for that kind of analytics.
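
Roughly, the kind of batch script I have in mind would look like the sketch below. The index names, field names and the use of the elasticsearch-py 8.x client are assumptions on my part; a real job would also page with search_after and only fetch events since the last checkpoint:

```python
from elasticsearch import Elasticsearch, helpers
import pandas as pd

es = Elasticsearch("http://localhost:9200")

# Pull the raw change log (hypothetical index and field names).
resp = es.search(index="user_mood_log", size=10_000,
                 sort=[{"changed_at": "asc"}],
                 query={"match_all": {}})
df = pd.DataFrame([hit["_source"] for hit in resp["hits"]["hits"]])
df["changed_at"] = pd.to_datetime(df["changed_at"])

# Duration of each state = time until the same user's next change.
df = df.sort_values(["user", "changed_at"])
df["ends_at"] = df.groupby("user")["changed_at"].shift(-1)
df = df.dropna(subset=["ends_at"]).copy()
df["duration_s"] = (df["ends_at"] - df["changed_at"]).dt.total_seconds()

# One summary document per user/day/mood; a deterministic _id so that
# reruns overwrite instead of duplicating.
summary = (df.assign(day=df["changed_at"].dt.date.astype(str))
             .groupby(["user", "day", "mood"], as_index=False)["duration_s"]
             .sum())
actions = ({"_index": "user_mood_daily",
            "_id": f"{r.user}-{r.day}-{r.mood}",
            "_source": r._asdict()} for r in summary.itertuples(index=False))
helpers.bulk(es, actions)
```

Standard Kibana visualizations could then aggregate the summary index directly (sum of duration_s split by mood, average duration of "sad" over time, and so on).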

I just realized that this could be done if we could "upsample" the data.

Let's imagine the same example:
| changed_at       | mood  | user |
|------------------|-------|------|
| yyyy-mm-dd 12:00 | sad   | John |
| yyyy-mm-dd 15:00 | happy | John |
| yyyy-mm-dd 18:00 | happy | John |

Which I upsample to the minute:

| changed_at       | mood  | user |
|------------------|-------|------|
| yyyy-mm-dd 12:00 | sad   | John |
| yyyy-mm-dd 12:01 | sad   | John |
| yyyy-mm-dd 12:02 | sad   | John |
| ...              | ...   | ...  |
| yyyy-mm-dd 14:58 | sad   | John |
| yyyy-mm-dd 14:59 | sad   | John |
| yyyy-mm-dd 15:00 | happy | John |
| yyyy-mm-dd 15:01 | happy | John |
| yyyy-mm-dd 15:02 | happy | John |

Then I could aggregate that data with a count of records, and I would have the number of minutes spent in each state.

The aggregation would be pretty simple. Is there a way to upsample data in a transform, the way pandas.resample would?
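
In pandas terms I mean something like this, using the sample rows above (just to show the kind of up-sampling I'm thinking of, with invented dates):

```python
import pandas as pd

log = pd.DataFrame({
    "changed_at": pd.to_datetime(["2021-06-01 12:00",
                                  "2021-06-01 15:00",
                                  "2021-06-01 18:00"]),
    "mood": ["sad", "happy", "happy"],
    "user": ["John", "John", "John"],
})

# Upsample to one row per minute, carrying the last known mood forward.
per_minute = (log.set_index("changed_at")
                 .groupby("user")["mood"]
                 .resample("1min")
                 .ffill())

# The minutes spent in each mood per user and per day are then just a count.
minutes = (per_minute.reset_index()
           .groupby(["user", pd.Grouper(key="changed_at", freq="1D"), "mood"])
           .size())
print(minutes)
```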

Thank you

I am not aware of up-sampling in Elasticsearch. What strikes me in this use case: it is possible to calculate this with a script, however every new data point would cause a full recalculation, therefore I think this won't scale. It would help a lot if you create daily buckets, but if the 1st data point of a day looks like:

| yyyy-mm-dd 00:24 | sad | John |

it's not possible to know whether "John" was "sad" or "happy" for the first 24 minutes of the day. For the right side of the window you could assume the last state persists for the rest of the day; if you get an update, you would recalculate anyway.

It's not possible to access the value of the last time bucket. The only option I see is custom code.
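
To make that concrete: the custom code would have to split every state interval at midnight and carry the last known state into the new day, along these lines (plain Python, only a sketch of the logic):

```python
from datetime import datetime, timedelta

def split_by_day(start, end):
    """Yield (day, seconds) for the portion of [start, end) falling on each day."""
    cur = start
    while cur < end:
        next_midnight = datetime.combine(cur.date(), datetime.min.time()) + timedelta(days=1)
        chunk_end = min(end, next_midnight)
        yield cur.date(), (chunk_end - cur).total_seconds()
        cur = chunk_end

# "John" turns sad at 22:00 and happy at 00:24 the next day: 2 hours land on
# day one, 24 minutes on day two (dates invented for the example).
for day, secs in split_by_day(datetime(2021, 6, 1, 22, 0),
                              datetime(2021, 6, 2, 0, 24)):
    print(day, secs)
```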
