Calculations between 2 Kibana hits on the @timestamp parameter

Hello everyone,

Here is the context:

In Kibana, I have 2 hits, let's say A and B. Hit B is an "answer" to hit A and is therefore received after hit A is created (anywhere from a few seconds up to a few days later). Hits A and B share the same id.

I would like to do some calculations on the time between hit A and hit B: average time until hit B is received, median time, max time, min time, etc., knowing that I only have the @timestamp parameter, which defines their creation time.

And I would like to be able to visualize this information on a time scale to understand, for example, the percentage of hit Bs that were received 1 hour after their hit A, 4 hours after, 12 hours after, etc.

I've been using Kibana for some time but I haven't been able to do this yet, not least because I can't apply an average-type aggregation to @timestamp.

I hope it will be possible to do this in Kibana. Don't hesitate to ask questions if some points seem unclear.

Thank you very much in advance for your help.

Best regards,

Baptiste

You need to use the Elasticsearch transforms feature to do this. https://www.elastic.co/guide/en/elasticsearch/reference/current/transforms.html
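As a rough sketch, a transform that groups documents by their shared id and keeps the first and last @timestamp could be created like this (hit-index, hit-latency and the id field name are placeholders for your own names; on 7.4 the endpoint is PUT _data_frame/transforms/hit-latency instead of PUT _transform/hit-latency):

PUT _transform/hit-latency
{
  "source": { "index": "hit-index" },
  "dest": { "index": "hit-latency" },
  "pivot": {
    "group_by": {
      "id": { "terms": { "field": "id" } }
    },
    "aggregations": {
      "@timestamp.min": { "min": { "field": "@timestamp" } },
      "@timestamp.max": { "max": { "field": "@timestamp" } }
    }
  }
}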

Hello @wylie,

Thank you for your answer.

I've tried to use the Create Transform functionality in the Machine Learning module with the help of the documentation, but I didn't manage to do what I explained above.

First, how can I calculate the difference between two timestamps for a given id?

Then, for this new timestamp, which will correspond to the time it took hit A to trigger hit B, is it really possible to run aggregations on it such as the mean or the median? In my interface I see only 3 aggregations available for the timestamp field.

Thank you very much in advance for your help.

Best regards,

Baptiste

You will need to use the JSON editor instead of the UI to do this. Here is an example of calculating a time diff using a script:
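Roughly, the aggregations section can look like this (a sketch, not your exact mapping: min and max on a date field return epoch milliseconds in their value, so a bucket_script can subtract them to get the delay in ms):

"aggregations": {
  "@timestamp.min": { "min": { "field": "@timestamp" } },
  "@timestamp.max": { "max": { "field": "@timestamp" } },
  "@timestamp.duration_ms": {
    "bucket_script": {
      "buckets_path": {
        "min_time": "@timestamp.min.value",
        "max_time": "@timestamp.max.value"
      },
      "script": "params.max_time - params.min_time"
    }
  }
}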

Hello @wylie

Thanks for your help. I've managed to create my transform via the API.

I now have an issue with starting this transform:

When I use the command via the API, this is the error I get:

However, when I start it via the UI, I'm able to start the transform and it seems to work. The problem is that afterwards, I can't see my new index in Discover.

Could it be linked to authorization problems with index creation for my user account?

Thanks in advance.

Baptiste

The endpoint is _transform/<transform_id>/_start (or, if you are on <=7.4: _data_frame/transforms/<transform_id>/_start); your screenshot has at least 2 errors. However, there is no difference between starting the transform like this and starting it via the UI.
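For reference, from the dev console that is:

POST _transform/<transform_id>/_start

# on 7.4 and earlier:
POST _data_frame/transforms/<transform_id>/_start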

Most likely you are missing the index pattern. Can you check whether the index exists using the dev console? If so, please create a Kibana index pattern for the dest index. It's also possible that you are only missing the time field in the index pattern; this is a known issue in older versions of the wizard.
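For example, in the dev console:

GET _cat/indices/<dest_index>?v

If the index shows up there but not in Discover, the missing piece is the Kibana index pattern rather than the index itself.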

If the above does not help: please check the job messages in the UI, they should contain the error. It might also be useful to check the output of GET _transform/<transform_id>/_stats (<=7.4: GET _data_frame/transforms/<transform_id>/_stats). If the transform failed, the stats contain the error.

If you created the transform using the UI and you saw a preview, you should not have an authorization problem. Transform uses the role information of the creating user; if the preview works, the transform itself should work, too.

If none of these suggestions leads to a resolution: can you share the output of _stats (the UI shows it, too) and the relevant parts of your transform?

Hello @Hendrik_Muhs,

Thanks very much for your answer.

I've managed to fix my problem and I can now see my new index in Discover.

However, I would like to know if it's possible to filter this new index in Discover with the date picker that is normally located to the left of the refresh button at the top right of the Discover page.
Moreover, when I use my new index for a visualization, I can see the date picker, but when I change the date range I see no change in the data.

Thanks in advance for your help.

Baptiste

This sounds like a Kibana issue; transform indexes are no different from any other indexes. I still suspect a problem with the index pattern. Maybe someone with more knowledge about Kibana can answer this better than me.

Well, I saw that in the tutorial example available in the Kibana Transform documentation, the date picker is also missing:

(link of the tutorial: https://www.elastic.co/guide/en/elasticsearch/reference/7.4/ecommerce-transforms.html).

Here is my index pattern and the result in Discover (seems to work fine):

Could you @ someone who might be able to answer this, please?

Thanks very much in advance!

Hi @Baptiste_Orsoni,

Probably the time controls in Discover are missing because your index pattern doesn't have a time field defined.
Starting from 7.9.0, the transform wizard allows you to specify the time field for the destination index pattern, see https://github.com/elastic/kibana/pull/68842.
Alternatively (e.g. on older versions), you can still manually delete the index pattern in the Stack Management section and re-create it with a time field defined.

Let me know if that doesn't bring the time controls in Discover back.

Hi @roskamp,

Thanks for your help!

I'm using 7.4, so I did what you described in the second alternative. The problem is, when I had to define a time field, I only had the choice between @timestamp.max and @timestamp.min. I chose @timestamp.max, but then when I check my index in Discover, I do have the time controls back but I don't have any data anymore. Here is the script that defines my transform:

{
  "group_by": {
    "smpp.serversmpp.data.mt_id": { "terms": { "field": "smpp.serversmpp.data.mt_id" } }
  },
  "aggregations": {
    "@timestamp.min": { "min": { "field": "@timestamp" } },
    "@timestamp.max": { "max": { "field": "@timestamp" } },
    "@timestamp.duration_ms": {
      "bucket_script": {
        "buckets_path": {
          "min_time": "@timestamp.min.value",
          "max_time": "@timestamp.max.value"
        },
        "script": "(params.max_time - params.min_time)"
      }
    }
  }
}

I don't see how to define the normal @timestamp, because I don't need to group my data by it and it's not defined in the aggregations either.

Thanks in advance for your help!

Best regards,

Baptiste

Hi @Baptiste_Orsoni ,

Looking at your screenshot above, I can see dates in your timestamp.max field, so the data should show up in Discover. Have you tried selecting a wider time range, e.g. Last 1 year?

To your second question, how to add a timestamp to the destination data:
Looking at your configuration, your destination index currently holds a couple of aggregated stats for each mt_id. So if you want to add another timestamp (in addition to timestamp.max and timestamp.min), you need to decide how your destination index should look.
One option is to add a second group_by entry with a date_histogram, e.g.

"group_by": {
    "smpp.serversmpp.data.mt_id": {...},
    "my_timestamp": {
        "date_histogram": {
            "field": "@timestamp",
            "calendar_interval": "1h"
        }
    }
},

This would produce one entry per mt_id per hour, aggregating the data from that hour only, and it also exposes the field my_timestamp (or whatever you choose to name it).
But again, it really depends on what you need in your destination index. If the above didn't help you yet, let me know more about your use case and the structure you would like to see in the destination index.
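For reference, combining your existing configuration with the extra group_by, the pivot section could look like this sketch (my_timestamp is just an illustrative name):

"pivot": {
  "group_by": {
    "smpp.serversmpp.data.mt_id": { "terms": { "field": "smpp.serversmpp.data.mt_id" } },
    "my_timestamp": {
      "date_histogram": { "field": "@timestamp", "calendar_interval": "1h" }
    }
  },
  "aggregations": {
    "@timestamp.min": { "min": { "field": "@timestamp" } },
    "@timestamp.max": { "max": { "field": "@timestamp" } },
    "@timestamp.duration_ms": {
      "bucket_script": {
        "buckets_path": {
          "min_time": "@timestamp.min.value",
          "max_time": "@timestamp.max.value"
        },
        "script": "(params.max_time - params.min_time)"
      }
    }
  }
}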

Best regards,
Robert
