I guess you mean cold node? Cold nodes should hold old/outdated data, warm nodes should have recent data.
The problem you describe is a conceptual one: transform cannot know that you are only looking for the last document. I guess you are using a scripted_metric with an implementation similar to the one in the docs? Transform treats the scripted aggregation like any other aggregation.
However, we are aware of this problem: latest_doc/state is one of the top asks for transform, and we are looking into ways to better support this use case.
But there is a workaround. After transform has created the first checkpoint (or even before, if you do not care about historic state), you can update the transform's query and add a range query with date math to filter out old data:
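A minimal sketch of what such an update body could look like, built in Python so it can be printed and pasted into the dev console as the body of `POST _transform/<transform_id>/_update`. The index name `my-source-index` and the field name `@timestamp` are placeholders, not names from this thread; replace them with your own.

```python
import json


def build_update_body(source_index: str, time_field: str, lookback: str) -> dict:
    """Build a transform update body that restricts the source query.

    source_index and time_field are hypothetical placeholders;
    lookback is a date math duration such as "1d".
    """
    return {
        "source": {
            "index": source_index,
            "query": {
                "range": {
                    # Only documents newer than now minus the lookback are read.
                    time_field: {"gte": f"now-{lookback}"}
                }
            },
        }
    }


body = build_update_body("my-source-index", "@timestamp", "1d")
print(json.dumps(body, indent=2))
```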
For this example we only allow data that is at most 1 day old. You can tweak this to your needs and align it with your setting for frequency. The value should be at least delay + frequency. Note: such a range query can be dangerous. If the window is too short, transform skips over documents and produces wrong data.
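As a rough sanity check for the delay + frequency rule, something like the following could be used. It is only a sketch: the helper names are made up for illustration, it understands only simple values such as 60s, 5m, 2h or 1d, and the delay and frequency values shown are example numbers, not your actual settings.

```python
import re
from datetime import timedelta

# Maps a time unit suffix to the matching timedelta keyword argument.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_time_value(value: str) -> timedelta:
    """Parse a simple time value such as '60s' or '1d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def lookback_is_safe(lookback: str, delay: str, frequency: str) -> bool:
    """Return True if the range window covers at least delay + frequency."""
    minimum = parse_time_value(delay) + parse_time_value(frequency)
    return parse_time_value(lookback) >= minimum


print(lookback_is_safe("1d", "60s", "5m"))  # True
print(lookback_is_safe("2m", "60s", "5m"))  # False
```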
To confirm the approach, you can manually run queries with the suggested range query in the dev console. In the output you should see skipped shards (the skipped count under _shards), which tells you whether it worked. Shards are skipped in the can-match phase of query execution. In other words, the coordinating node prunes the set of shards according to the range query; for pruned shards it won't forward the search request to, e.g., a cold node that holds that shard.
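If you want to check this from a script rather than by eye, a small helper along these lines could work. It assumes you already have the search response as a Python dict (for example from a client library); the sample response below is made up for illustration and is not real output.

```python
def shards_were_skipped(search_response: dict) -> bool:
    """Return True if the search response reports at least one skipped shard."""
    shards = search_response.get("_shards", {})
    return shards.get("skipped", 0) > 0


# Made-up example of the _shards section of a search response.
sample_response = {
    "_shards": {"total": 10, "successful": 10, "skipped": 7, "failed": 0}
}

print(shards_were_skipped(sample_response))  # True
```

If this returns False for a query that should only touch recent data, the range query is probably not pruning shards the way you expect.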
Hope this helps.