How do Kibana Transforms work?

My question is: can we make Kibana Transforms do the same work as a rollup job?
From what I have analyzed, Transforms groups events, runs aggregations on them, and writes the results to another index.
But I want to carry a number of fields and their events from one index to another, just like a rollup job does. Can this be done with Transforms?

Then keep using rollup.
Transform will only aggregate; it will not keep any link to the raw documents.

Thanks for your reply, Yassine!
Actually, we are facing an issue with the rollup job, and it was suggested that we explore Transforms for the same purpose if rollup doesn't work.
In rollup, data gets indexed into the rollup index, but not the latest data. It stores only the last 90 days of data.

Can you give an example with e.g. data and/or the rollup job? What are you missing in the transform?

Rollup and transform are not that different (some explanation below): both aggregate/reduce data, and both write into a new index.

Can you elaborate, give a data example?
Maybe you just need to know which aggregation to use:

  • If you want to keep a count, use the value_count aggregation.
  • If you want to collapse the events into the transform destination index, scripted_metric can do this.
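As a rough sketch of the first option, this is what a pivot transform body with a value_count aggregation could look like. The index names (events, events-counted) and the field names (field1, field2) are made up for illustration, and the snippet only builds the JSON body you would send to PUT _transform/<transform_id>; it does not talk to a cluster.

```python
import json


def build_count_transform(source_index, dest_index, group_field, counted_field):
    """Build the body for PUT _transform/<transform_id>.

    Groups documents by `group_field` (terms) and counts how many values
    of `counted_field` fall into each group (value_count).
    All names passed in are hypothetical placeholders.
    """
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "pivot": {
            "group_by": {
                group_field: {"terms": {"field": group_field}},
            },
            "aggregations": {
                "event_count": {"value_count": {"field": counted_field}},
            },
        },
    }


# Hypothetical index and field names, for illustration only.
body = build_count_transform("events", "events-counted", "field2", "field1")
print(json.dumps(body, indent=2))
```

The destination index then holds one document per distinct value of the group_by field, with event_count next to it.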

Am I getting this right? Your rollup aggregation uses a fixed_interval of 90 days?

The difference between rollup and transform:

  • rollup only aggregates complete buckets. If you aggregate by 90 days, rollup waits until 90 days have passed. However, that's only a problem if you query the rollup index directly; you should use _rollup_search instead. Rollup search is a federated search that queries the rollup index and the source index and brings the results together (spoiler: this will become easier in the future).
  • transform writes intermediate buckets, meaning buckets that are not complete yet. If your fixed_interval is 90 days and only 42 days have passed since the start of the bucket, it writes the last bucket with those 42 days of data. With every run, transform updates/overwrites the bucket (43 days, 44, ...). Note that you need to configure it in continuous mode. Transform does not provide anything like rollup search.
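To make the continuous-mode part more concrete, here is a minimal sketch of a transform body that buckets by a 90-day fixed_interval and carries a sync section, which is what makes a transform continuous. The index names, the @timestamp field, and the frequency and delay values are assumptions for illustration; adjust them to your data.

```python
import json


def build_continuous_transform(source_index, dest_index, time_field, interval="90d"):
    """Build the body for PUT _transform/<transform_id> in continuous mode.

    The `sync` section turns the transform into a continuous one, so the
    still-open bucket gets rewritten on every checkpoint.
    Index and field names are hypothetical placeholders.
    """
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        # How often the transform checks the source index for changes.
        "frequency": "5m",
        # Continuous mode: track new data via a timestamp field.
        "sync": {"time": {"field": time_field, "delay": "60s"}},
        "pivot": {
            "group_by": {
                "bucket_start": {
                    "date_histogram": {
                        "field": time_field,
                        "fixed_interval": interval,
                    }
                },
            },
            "aggregations": {
                "event_count": {"value_count": {"field": time_field}},
            },
        },
    }


# Hypothetical names, for illustration only.
continuous_body = build_continuous_transform("events", "events-90d", "@timestamp")
print(json.dumps(continuous_body, indent=2))
```

Without the sync section the same body describes a batch transform, which runs once and stops.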

The question "Can we make Kibana Transforms do the same work as Rollup job does?" cannot be answered easily. In most cases the answer is yes, because transform is the more generic feature. However, transform does not provide rollup search.

I can answer this better once I understand your use case.


Thanks for this explanation, Hendrik :)
Coming to my use case: I have an index with a set of, say, 50 fields, and a very large amount of data is streaming in. Now I want to fetch around 10 of those fields and get summarized data in a new index.
In this case, how exactly will rollup work? All the latest data for those 10 fields should arrive in the new index, right? And how can we do this with a transform as well? Transform requires aggregation and grouping, but I basically don't want to perform any aggregation. Also, how is the data indexed in rollup/transform? Only a small number of events get indexed, and after that nothing gets indexed.
Can you please help me understand the usage of rollup better?

OK, let me get this right.

Your stream of events contains data like:

    "field2": "something",
    "field50": "something_else"

In the new index you want:

    "field2": "something",
    "field10": "abcd"

You actually don't want to do something like sum(field1) to get e.g. the sum of all of those fields, right? That's what you mean, you don't want to aggregate?

If you only want the latest, there is a special transform function to do this called, well, latest. ;)

Latest will copy whatever is latest to the destination index, e.g. based on a timestamp. However, it won't drop fields; it copies full documents. You can drop fields in an ingest pipeline if you really need to. However, I think we can ignore that for the first iteration.
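Roughly, the two pieces could look like the sketch below: a latest transform body and an optional ingest pipeline with a remove processor that strips unwanted fields before documents reach the destination index. All names here (events, events-latest, the pipeline ID drop-unneeded-fields, field2 as unique key, @timestamp as sort field, and the dropped fields) are assumptions for illustration. Keep in mind that latest keeps one document per unique key, not every event.

```python
import json


def build_drop_fields_pipeline(fields_to_drop):
    """Build the body for PUT _ingest/pipeline/<pipeline_id>.

    Uses the `remove` processor; `ignore_missing` avoids failures when a
    document does not contain one of the listed fields.
    """
    return {
        "description": "Drop fields that are not needed in the destination index",
        "processors": [
            {"remove": {"field": list(fields_to_drop), "ignore_missing": True}},
        ],
    }


def build_latest_transform(source_index, dest_index, unique_key, sort_field,
                           pipeline_id=None):
    """Build the body for PUT _transform/<transform_id> using `latest`.

    For every distinct value of `unique_key`, the document with the
    highest `sort_field` value is copied to the destination index.
    """
    dest = {"index": dest_index}
    if pipeline_id is not None:
        # Route copied documents through the ingest pipeline first.
        dest["pipeline"] = pipeline_id
    return {
        "source": {"index": source_index},
        "dest": dest,
        "latest": {"unique_key": [unique_key], "sort": sort_field},
        # Continuous mode, so newer documents keep replacing older ones.
        "sync": {"time": {"field": sort_field}},
    }


# Hypothetical names, for illustration only.
pipeline = build_drop_fields_pipeline(["field11", "field50"])
transform = build_latest_transform(
    "events", "events-latest", "field2", "@timestamp",
    pipeline_id="drop-unneeded-fields",
)
print(json.dumps(pipeline, indent=2))
print(json.dumps(transform, indent=2))
```

The pipeline has to exist under the referenced ID before the transform starts writing, so create it first.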

If I still did not get it, please explain what you mean by "summarized data".

To me it seems like transform is the way to go, but I am not sure which function is the right one here: there is pivot, which is a group by, and there is latest. Have a look at the docs, e.g. this tutorial. The description of latest starts at point 7. It's also important that you read about continuous vs. batch; continuous requires an extra step, see point 4 in the tutorial. There are more great docs around this tutorial, so please check them out as well. The docs also explain what happens on the technical level.
