Forming a new document with all the fields everytime a new event comes in

Hello, So I am currently trying to figure this out and would love to know if my current approach is correct and if not, is there a way to do it. So, the question is
We have these events coming in with each event having minimum 3 states. Now, the first one is a complete event but the rest are just the necessary information needed. The task is when these event comes in, form the complete event from its past state and add it to a new index. for e.g.
'''
State 1:
{
state : "create",
startTime: "2020-12-13T21:34:16.978Z",
lastUpdateTime : "2020-12-13T21:34:16.978Z",
id : "01",
name : "xyz",
data : ["abc","c"]
}

State 2:
{
state : "Inprogress",
lastUpdateTime : "2020-12-13T21:34:16.978Z",
id : "01"
}

State 3:
{
state : "Completed",
lastUpdateTime : "2020-12-13T21:34:16.978Z",
id : "01"
}
'''
As you can see not all states are coming with all the fields so when we create this docx in the new index we have to not only add updated fields but also the field that were not present in that particular state.
Some of the things that I have already looked into are :
Collapse : Does not work as it just gives the latest document.
Tranform(Latest) : Does not work as it just replaces with the latest document.

Would love to know if pivot transform would work or is there any other way to do this.

I think what you describe is a case for pivot. I would group_by id and e.g. use max(lastUpdateTime). Regarding filling in sparse data: I think you need a scripted_metric to implement this. You could look for the fields and keep whatever every value that is not null.

It might get complicated if its possible to get conflicts e.g. 2 different names, in this case you need a tiebreaker, e.g. take the latest value.

Thank you for the reply. So, I started working with pivot and used group_by on id. After that I could not use max(lastUpdateTime) outside as there can be fields in the previous version of the docx that I will need so instead I did terms aggs and for every term I am ordering it based on its lastUpdateTime. But the problem is that now it is coming with that timestamp in the field and also it feels a bit complicated. Would love to know how would I be able to do a scripted_metric on this. Here is a snippet of what I wrote :

'''
POST _transform/_preview
{
"source": {
"index": "test3",
"query": {
"match_all": {}
}
},
"dest" : {
"index" : "test4"
},
"pivot": {
"group_by": {
"id": { "terms": { "field": "id" }}
},
"aggs": {
"state":
{
"terms" :
{
"field": "state",
"size": 1,
"order":
{
"last": "desc"
}
},
"aggs":{
"last":{
"max": {
"field": "lastUpdateTime"
}
}
}
},
"name":{
"terms":
{
"field": "name",
"size": 1
}}
}
},
"frequency": "5m",
"sync": {
"time": {
"field": "@timestamp",
"delay": "60s"
}
}
}
'''

@Hendrik_Muhs Also, if possible can you share an example of scripted metric that would fit this scenario. Thanks again.