How can I get the running time in this case?

There are two docs, both added by the application logger (log4j).

Both docs share the same instance_id (key).

#1
{
"instance_id": "ee0e5890-4968-4dec-80c7-69e22ed4001b" ,
"message": "done",
"@TimeStamp" : "2019-08-06T09:33:05.046Z"
}

#2
{
"instance_id": "ee0e5890-4968-4dec-80c7-69e22ed4001b" ,
"message": "start",
"@TimeStamp" : "2019-08-06T09:00:00.581Z"
}

I can't restructure the data sent to Elasticsearch like below:

{
"instance_id": "ee0e5890-4968-4dec-80c7-69e22ed4001b" ,
"start_time": "2019-08-06T09:00:00.581Z",
"end_time": "2019-08-06T09:33:05.046Z"
}

I want to calculate the running time between #1 and #2. Can anyone help? :frowning:

I think the new data frame transform functionality in Elasticsearch would provide a good solution for you. Using data frame transforms you can do a "group by" on a specific field (instance_id) and write the results of an aggregation on that grouped data to another index. In your case you could calculate the running time and write it to another index, which you can then use however you like.

Let's say you had indexed your data like this:

PUT my_index/_doc/1
{
  "instance_id": "ee0e5890-4968-4dec-80c7-69e22ed4001b",
  "message": "done",
  "@TimeStamp": "2019-08-06T09:33:05.046Z"
}

PUT my_index/_doc/2
{
  "instance_id": "ee0e5890-4968-4dec-80c7-69e22ed4001b",
  "message": "start",
  "@TimeStamp": "2019-08-06T09:00:00.581Z"
}

You can define a transform that aggregates this data with a scripted metric aggregation, calculates the running time, and writes the results to another index, dest_index:

PUT _data_frame/transforms/transaction_transform
{
  "source": {
    "index": "my_index"
  },
  "dest": {
    "index": "dest_index"
  },
  "pivot": {
    "group_by": {
      "instance_id": {
        "terms": {
          "field": "instance_id.keyword"
        }
      }
    },
    "aggregations": {
      "timestamps": {
        "scripted_metric": {
          "init_script": "state.responses = ['start_time':0L,'end_time':0L]",
          "map_script": """
            def message = doc['message.keyword'].value;
            def timestamp = doc['@TimeStamp'].value;
            
            if (message.equals('start')) {
              state.responses.start_time = timestamp;
            } else if (message.equals('done')) {
              state.responses.end_time = timestamp;
            }
""",
          "combine_script": "state.responses",
          "reduce_script": """
            def timestamps = ['start_time': 0L, 'end_time': 0L, 'running_time': 0L];
            for (responses in states) {
              // Only take non-zero values, so a shard that saw only the
              // 'start' doc doesn't overwrite the 'done' timestamp (and vice versa)
              if (responses['start_time'] != 0L) {
                timestamps.start_time = responses['start_time'];
              }
              if (responses['end_time'] != 0L) {
                timestamps.end_time = responses['end_time'];
              }
            }
            
            if (timestamps.start_time != 0L && timestamps.end_time != 0L) {
              timestamps.running_time = timestamps.end_time.toInstant().toEpochMilli() - timestamps.start_time.toInstant().toEpochMilli();
            }
            return timestamps;
"""
        }
      }
    }
  },
  "frequency": "5m",
  "sync": {
    "time": {
      "field": "@TimeStamp",
      "delay": "60s"
    }
  }
}
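
One thing to note: creating the transform does not start it. Assuming you kept the transform id transaction_transform from above, you start (and later stop) it with the transform start/stop APIs:

POST _data_frame/transforms/transaction_transform/_start

POST _data_frame/transforms/transaction_transform/_stop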

The resulting documents in dest_index will now contain a timestamps.running_time field with the running time in ms:

"instance_id" : "ee0e5890-4968-4dec-80c7-69e22ed4001b",
"timestamps" : {
  "start_time" : "2019-08-06T09:00:00.581Z",
  "end_time" : "2019-08-06T09:33:05.046Z",
  "running_time" : 1984465
}

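The excerpt above is the _source of a hit; you can inspect the transform output yourself with a simple search on the destination index:

GET dest_index/_search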
Thank you so much, abdon!

These properties cause an error... I'm using Elasticsearch 7.2.0.

"frequeuncy": "5m",
"sync": {
"time": {
"field": "@TimeStamp",
"delay": "60s"
}
}

[error message]
{
"error": {
"root_cause": [
{
"type": "x_content_parse_exception",
"reason": "[27:3] [data_frame_transform_config] unknown field [frequeuncy], parser not found"
}
],
"type": "x_content_parse_exception",
"reason": "[27:3] [data_frame_transform_config] unknown field [frequeuncy], parser not found"
},
"status": 400
}

Looks like you misspelled frequency as frequeuncy.
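
For reference, the corrected fragment (only the field name changes):

"frequency": "5m",
"sync": {
  "time": {
    "field": "@TimeStamp",
    "delay": "60s"
  }
}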