I have multiple logs being sent to Elasticsearch in the following format:
{
"resourceId" : 12345,
"event" : "SUBMITTED/PROCESSING/SUCCESSFUL",
"event_time" : "dateString"
}
The resourceId is the same across all logs for a given resource.
What I want to calculate for my metrics is the time taken between the SUBMITTED and SUCCESSFUL steps.
I am using an Elastic Transform for this use case (since the calculation spans different documents), along with a scripted metric to calculate the time difference.
I am able to group the documents by resourceId and do some calculations, but the data I am getting is incorrect. It looks like I am not handling the case where the documents for the same resource are stored across different shards.
My aggregation looks like this:
{
  "aggs": {
    "duration": {
      "scripted_metric": {
        "init_script": " state.start = 0; state.end = 0; state.duration = 0",
        "map_script": "if (doc['event.keyword'].value.equals(\"SUBMITTED\")) {state.start = doc.event_time.value.toInstant().toEpochMilli()} else if (doc['event.keyword'].value.equals(\"SUCCESS\")) { state.end = doc['event_time'].value.toInstant().toEpochMilli()} ",
        "combine_script": "if (state.start != 0 && state.end!= 0) {(state.duration = state.end - state.start); return state.duration;} else { state.duration = -1; return state.duration; }",
        "reduce_script": "double b = 0; for (a in states) { if (a != null) { b = a }} return b"
      }
    }
  }
}
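One direction I am considering, as an untested sketch: since combine_script runs once per shard, a shard that holds only the SUBMITTED document or only the SUCCESSFUL document can never compute the difference on its own. The version below returns the raw per-shard state from combine_script and does the subtraction only in reduce_script. It assumes event_time is mapped as a date, that the event.keyword subfield exists, and that the final event value is SUCCESSFUL as in the sample document (my current script checks for SUCCESS). The variable names startMs and endMs are my own.

```json
{
  "aggs": {
    "duration": {
      "scripted_metric": {
        "init_script": "state.start = 0L; state.end = 0L;",
        "map_script": "def t = doc['event_time'].value.toInstant().toEpochMilli(); def e = doc['event.keyword'].value; if (e == 'SUBMITTED') { state.start = t; } else if (e == 'SUCCESSFUL') { state.end = t; }",
        "combine_script": "return state;",
        "reduce_script": "long startMs = 0L; long endMs = 0L; for (s in states) { if (s != null) { if (s.start != 0L) { startMs = s.start; } if (s.end != 0L) { endMs = s.end; } } } return (startMs != 0L && endMs != 0L) ? endMs - startMs : -1L;"
      }
    }
  }
}
```

With this shape, -1 would still mark resources where one of the two events has not arrived yet.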
Can someone help me with how this can be achieved?
Thanks
PS: I don't use Logstash, so I can't use its elapsed filter. Ideally I don't want to use any filters.