Getting latest data per user_id in time series data without latest transforms?

MaterializedView · June 30, 2023, 4:06pm

I have a users index. Users have various statuses: "New", "Waiting", "Completed". A status can go from "Completed" back to "New". So as a time series it would look something like this:

user_id  status     timestamp
1        NEW        2023-06-30 5:00
1        WAITING    2023-06-30 6:00
1        COMPLETED  2023-06-30 7:00
1        NEW        2023-06-30 8:00

2        NEW        2023-06-30 5:00
2        WAITING    2023-06-30 6:00
2        COMPLETED  2023-06-30 7:00

I have millions of users. I would like to get the number of users who are currently in the NEW state. The result should be 1, but with pivot aggregations I get count = 2, because user_id 1 was COMPLETED at some point but then went back to "NEW".
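To make the mismatch concrete, here is a minimal Python sketch over the sample rows above. It is only an illustration of the counting logic, not an Elasticsearch query; the function names `count_ever_in` and `count_currently_in` are made up for this example. Counting every user that has any NEW record gives 2, while keeping only each user's most recent record first gives 1.

```python
from datetime import datetime

# Sample rows from the post: (user_id, status, timestamp).
rows = [
    (1, "NEW", "2023-06-30 05:00"),
    (1, "WAITING", "2023-06-30 06:00"),
    (1, "COMPLETED", "2023-06-30 07:00"),
    (1, "NEW", "2023-06-30 08:00"),
    (2, "NEW", "2023-06-30 05:00"),
    (2, "WAITING", "2023-06-30 06:00"),
    (2, "COMPLETED", "2023-06-30 07:00"),
]


def parse(ts):
    """Parse the timestamp format used in the sample rows."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")


def count_ever_in(rows, status):
    """Count distinct users that have at least one record with this status."""
    return len({user_id for user_id, st, _ in rows if st == status})


def count_currently_in(rows, status):
    """Count users whose most recent record has this status."""
    latest = {}  # user_id -> (timestamp, status) of the newest record seen
    for user_id, st, ts in rows:
        t = parse(ts)
        if user_id not in latest or t > latest[user_id][0]:
            latest[user_id] = (t, st)
    return sum(1 for _, st in latest.values() if st == status)


print(count_ever_in(rows, "NEW"))       # 2: every user with any NEW record
print(count_currently_in(rows, "NEW"))  # 1: only user 1 is NEW right now
```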

Latest transforms are too expensive usage-wise, so we want to run these queries on demand. Is this possible with collapse or pipeline aggregations? Could someone help me here?

Alexis_Roberson · June 30, 2023, 6:12pm

Can you ask your question in a different way? You mentioned you have an existing index; do you also have time series data? And are you saying the user index is being updated so that the status periodically moves from the "Completed" to the "New" value?

MaterializedView · June 30, 2023, 6:21pm

Yes, I have a users_time_series index. The user status goes from NEW to WAITING to COMPLETED. In some cases it can go NEW > WAITING > COMPLETED > NEW.

Since it's a time series, we just append to the index. I want to get the count of users who are currently in COMPLETED status. I can do this via latest transforms, but I was told the usage isn't that high, so we must do it without transforms.
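One possible on-demand approach, sketched here under several assumptions rather than offered as a confirmed answer: page through users with a `composite` aggregation, attach a `top_metrics` sub-aggregation that returns the status of each user's newest document, and tally the statuses client-side. The field names `user_id`, `status` and `timestamp` are taken from the sample data; it is assumed that `status` is mapped as `keyword` and that the cluster version supports `top_metrics` on keyword fields. The helper names `latest_status_request` and `tally_page` are hypothetical, and no Elasticsearch client call is shown.

```python
from collections import Counter


def latest_status_request(after_key=None, page_size=1000):
    """Build a search body: one bucket per user, newest status per bucket.

    Assumed field names: user_id, status, timestamp.
    """
    composite = {
        "size": page_size,
        "sources": [{"user": {"terms": {"field": "user_id"}}}],
    }
    if after_key is not None:
        # Resume from the last bucket of the previous page.
        composite["after"] = after_key
    return {
        "size": 0,  # no hits needed, only aggregations
        "aggs": {
            "by_user": {
                "composite": composite,
                "aggs": {
                    "latest": {
                        "top_metrics": {
                            "metrics": {"field": "status"},
                            "sort": {"timestamp": "desc"},
                        }
                    }
                },
            }
        },
    }


def tally_page(response, counts):
    """Add one page of buckets to counts; return the key for the next page."""
    agg = response["aggregations"]["by_user"]
    for bucket in agg["buckets"]:
        top = bucket["latest"]["top"]
        if top:  # skip buckets without a status value
            counts[top[0]["metrics"]["status"]] += 1
    return agg.get("after_key")  # None once the last page is reached


# Hand-written response in the assumed shape, matching the sample data.
sample_response = {
    "aggregations": {
        "by_user": {
            "buckets": [
                {"key": {"user": 1}, "doc_count": 4,
                 "latest": {"top": [{"sort": ["2023-06-30T08:00:00.000Z"],
                                     "metrics": {"status": "NEW"}}]}},
                {"key": {"user": 2}, "doc_count": 3,
                 "latest": {"top": [{"sort": ["2023-06-30T07:00:00.000Z"],
                                     "metrics": {"status": "COMPLETED"}}]}},
            ]
        }
    }
}

counts = Counter()
next_key = tally_page(sample_response, counts)
print(counts["COMPLETED"], counts["NEW"], next_key)
```

With millions of users this still means walking every user bucket page by page on each run, so it trades the standing cost of a transform for a heavier query at read time.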
