Understanding transform stats

I am currently in the process of figuring out how to implement a transform on a quite large index.

The monitoring part in particular is a bit puzzling to me. For example, search_time_in_ms on the transform currently shows 48167249, which seems quite slow.

However, there are no slow query warnings on the Elasticsearch nodes that run the transform. Also, when I open the "preview" tab of the transform in Elastic, the result is shown immediately, and I can paginate through the preview quite quickly.

According to the docs, the value represents "The amount of time spent searching, in milliseconds.", but what does this mean exactly?

The mean/average time it spent searching over all past searches? Is it a single search query duration?

Hey @pulsy,
What version are you on?

Can you post the full output of the GET transform statistics API here?

hi @greco - sure:

{
  "count" : 1,
  "transforms" : [
    {
      "id" : "prepare-events",
      "state" : "indexing",
      "stats" : {
        "pages_processed" : 71037,
        "documents_processed" : 41864978,
        "documents_indexed" : 41698701,
        "documents_deleted" : 0,
        "trigger_count" : 204,
        "index_time_in_ms" : 6813652,
        "index_total" : 35200,
        "index_failures" : 0,
        "search_time_in_ms" : 55956129,
        "search_total" : 71038,
        "search_failures" : 0,
        "processing_time_in_ms" : 649302,
        "processing_total" : 71037,
        "delete_time_in_ms" : 0,
        "exponential_avg_checkpoint_duration_ms" : 1038350.2659654395,
        "exponential_avg_documents_indexed" : 1320182.8309541699,
        "exponential_avg_documents_processed" : 1343300.0856498873
      },
      "checkpointing" : {
        "last" : {
          "checkpoint" : 4,
          "timestamp_millis" : 1694619620033,
          "time_upper_bound_millis" : 1694619560033
        },
        "next" : {
          "checkpoint" : 5,
          "position" : {
            "bucket_position" : {
              "owner" : "74ddb173-f6c6-486c-bda7-59bcf520f1ac",
              "messageId" : "20ed1061-47f9-11ee-8b38-557f4528ee56"
            }
          },
          "checkpoint_progress" : {
            "docs_indexed" : 34229377,
            "docs_processed" : 34233071
          },
          "timestamp_millis" : 1694625075312,
          "time_upper_bound_millis" : 1694625015312
        },
        "operations_behind" : 13747650,
        "changes_last_detected_at" : 1694696169593,
        "last_search_time" : 1694696169593
      }
    }
  ]
}

According to the docs, the value represents "The amount of time spent searching, in milliseconds.", but what does this mean exactly?

That's a good question :)

search_time_in_ms is the sum of all search times since the transform started, so it is a cumulative counter that only ever grows.
To obtain the average search time, you must divide it by search_total, which in your case gives:
55956129 / 71038 ≈ 788 milliseconds on average to perform a search.
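The same division works for the other counter pairs in the stats output. Here is a minimal sketch in Python that computes the per-operation averages from the numbers you posted; the helper name `average_ms` is just something I made up for illustration, and it assumes each `*_time_in_ms` value pairs with the matching `*_total`:

```python
# Counters copied from the "stats" section of the transform stats output above.
stats = {
    "search_time_in_ms": 55956129,
    "search_total": 71038,
    "index_time_in_ms": 6813652,
    "index_total": 35200,
    "processing_time_in_ms": 649302,
    "processing_total": 71037,
}


def average_ms(stats, phase):
    """Average duration of one operation in the given phase, in milliseconds.

    `phase` is the counter prefix: "search", "index" or "processing".
    """
    total = stats[f"{phase}_total"]
    if total == 0:
        return 0.0  # nothing has run yet, so avoid dividing by zero
    return stats[f"{phase}_time_in_ms"] / total


for phase in ("search", "index", "processing"):
    print(f"{phase}: {average_ms(stats, phase):.1f} ms on average")
```

Because the counters are cumulative, comparing two snapshots taken some time apart (difference in time divided by difference in total) would give the average for just that interval rather than for the whole lifetime of the transform.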


If you still think this is too slow (and especially if you observe that operations_behind tends to grow),
you can try to optimise your transform using the search Profile API.

To do so, you need to rewrite your transform as a standalone search query:

Take the relevant sections from the transform configuration and build a search request from them. Reuse the source query as the query, make the top-level aggregation one that is based on the group_by of the transform, and put the transform's aggregations underneath it as sub-aggregations.
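As a rough sketch of that rewrite, here is some Python that builds such a search body from a pivot transform configuration. I don't know your actual configuration, so `transform_config` below is hypothetical: the group_by names `owner` and `messageId` come from the bucket_position in your stats, but the index name, field names and the `last_seen` aggregation are placeholders. It also assumes the group_by maps onto the sources of a composite aggregation, and the page size of 500 is only an assumed value:

```python
import json

# Hypothetical transform configuration; replace it with your own
# (for example, the output of the get transform API).
transform_config = {
    "source": {"index": "events", "query": {"match_all": {}}},
    "pivot": {
        "group_by": {
            "owner": {"terms": {"field": "owner"}},
            "messageId": {"terms": {"field": "messageId"}},
        },
        "aggregations": {
            "last_seen": {"max": {"field": "@timestamp"}},
        },
    },
}


def to_profiled_search(config, page_size=500):
    """Build a standalone, profiled search body from a pivot transform config."""
    pivot = config["pivot"]
    # The pivot section may use either "aggregations" or "aggs" as the key.
    sub_aggs = pivot.get("aggregations") or pivot.get("aggs") or {}
    # Each group_by entry becomes one source of the composite aggregation.
    sources = [{name: spec} for name, spec in pivot["group_by"].items()]
    return {
        "size": 0,        # only the aggregation buckets matter, not the hits
        "profile": True,  # ask Elasticsearch to include timing details
        "query": config["source"].get("query", {"match_all": {}}),
        "aggs": {
            "group_by": {
                "composite": {"size": page_size, "sources": sources},
                "aggs": sub_aggs,
            }
        },
    }


body = to_profiled_search(transform_config)
print(json.dumps(body, indent=2))
```

The printed JSON can then be sent as the body of a search request against the source index, and the profile section of the response shows where the time goes.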

Hope that helps, good luck with profiling!

That clarifies a lot, thanks for your help!
