Split a data stream into separate Indexes - to allow different ILM policies

I'm using the Custom UDP Logs integration and need to split the data stream into two separate indexes (data streams), so I can enable different ILM policies on the two different types of data coming in. Is that possible, or do I need to use two different integrations coming from the same source? Suggestions?

Hi @mgordon

Have you looked at the reroute processor... with a condition? It was made for just such a use case.

First of all, the Custom UDP Logs and Custom TCP Logs integrations do not do a great job of helping the end user set up the proper index template structure.

In the Fleet integration we offer the option to customize the Dataset name, but the integration will not do anything for you automatically:
it will install, by default, the Index Template logs-udp.generic regardless of the Dataset name you've chosen.

Note that the Custom Logs integration is slightly different: if you set the Dataset name to my.dataset, it will automatically create an Index Template matching logs-my.dataset, so that you have a good starting point.
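
You can check what either integration actually installed from Dev Tools; the template names below assume the Dataset names mentioned above:

```
# Installed by Custom UDP Logs by default, regardless of the chosen Dataset name
GET _index_template/logs-udp.generic

# Installed by Custom Logs when the Dataset name is set to my.dataset
GET _index_template/logs-my.dataset
```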

Now, back to the Custom UDP Logs integration.

  1. Is the data the same structure/type, just related to different environments?
  2. Can the data be received on different ports or hosts?

If the data is the same and you cannot receive it on different ports/hosts, then I would go with one integration and route the events using an ingest pipeline.

This is based on 8.14-ish...

Approach A

Create a Custom UDP Integration with the Dataset name set to my.dataset.
I would recommend having an Index Template for a <type>-<dataset> index (e.g. a logs-my.dataset Index Template) completely decoupled from the udp.generic one.

To create the logs-my.dataset Index Template, you can clone logs-udp.generic (and all the associated component templates); a sketch of the result follows the list below.

  • Index Template logs-my.dataset (cloned from logs-udp.generic, with index pattern logs-my.dataset-*)
    • Component Templates:
      • logs@mappings
      • logs@settings
      • logs-my.dataset@package, cloned from logs-udp.generic@package, but:
        • replace the default_pipeline with logs-my.dataset - then create an empty ingest pipeline
        • define the type of the fields you will expect in the mappings section
      • logs-my.dataset@custom
      • ecs@mappings
      • .fleet_globals-1
      • .fleet_agent_id_verification-1
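
A minimal sketch of what that cloned structure could look like, assuming the names above (the priority value and the message field in the mappings are just placeholders; adjust to your data):

```
# Clone of logs-udp.generic@package: your own default_pipeline and mappings
PUT _component_template/logs-my.dataset@package
{
  "template": {
    "settings": {
      "index.default_pipeline": "logs-my.dataset"
    },
    "mappings": {
      "properties": {
        "message": { "type": "text" }
      }
    }
  }
}

# Index Template decoupled from the udp.generic assets
PUT _index_template/logs-my.dataset
{
  "index_patterns": ["logs-my.dataset-*"],
  "data_stream": {},
  "priority": 200,
  "composed_of": [
    "logs@mappings",
    "logs@settings",
    "logs-my.dataset@package",
    "logs-my.dataset@custom",
    "ecs@mappings",
    ".fleet_globals-1",
    ".fleet_agent_id_verification-1"
  ],
  "ignore_missing_component_templates": ["logs-my.dataset@custom"]
}
```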

Once you have this, you can clone the Index Template logs-my.dataset into the following (a sketch of one clone comes after the list):

  • Index Template logs-my.dataset-namespace1 with index pattern logs-my.dataset-namespace1 and custom ILM Policy n1
  • Index Template logs-my.dataset-namespace2 with index pattern logs-my.dataset-namespace2 and custom ILM Policy n2
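
A sketch of one of the two clones; the only changes versus logs-my.dataset are the exact-name index pattern, a higher priority so it wins over the logs-my.dataset-* pattern for that name, and the ILM policy (n1 is a placeholder):

```
PUT _index_template/logs-my.dataset-namespace1
{
  "index_patterns": ["logs-my.dataset-namespace1"],
  "data_stream": {},
  "priority": 250,
  "composed_of": [
    "logs@mappings",
    "logs@settings",
    "logs-my.dataset@package",
    "logs-my.dataset@custom",
    "ecs@mappings",
    ".fleet_globals-1",
    ".fleet_agent_id_verification-1"
  ],
  "template": {
    "settings": {
      "index.lifecycle.name": "n1"
    }
  },
  "ignore_missing_component_templates": ["logs-my.dataset@custom"]
}
```

Repeat the same for logs-my.dataset-namespace2 with ILM Policy n2.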

How do you route the events to different data streams?
Using the logs-my.dataset ingest pipeline with the reroute processor. You can reroute events to a different namespace using a conditional.
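
A minimal sketch of that pipeline; the condition on a hypothetical env field is purely illustrative, use whatever actually distinguishes your two types of data:

```
PUT _ingest/pipeline/logs-my.dataset
{
  "processors": [
    {
      "reroute": {
        "if": "ctx.env == 'prod'",
        "namespace": "namespace1"
      }
    },
    {
      "reroute": {
        "namespace": "namespace2"
      }
    }
  ]
}
```

Note that once a reroute processor fires, the rest of the pipeline is skipped, so the second reroute only acts as the fallback for events that didn't match the condition.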

This should also be reasonably resilient to future changes, as here we're fully decoupling the behavior from the Custom UDP Logs assets.

Approach B

If instead you are OK with keeping udp.generic as the Dataset name, then you can:

  • Clone the Index Template logs-udp.generic into 2 Index Templates
    • Index Template logs-udp.generic-namespace1 with index pattern logs-udp.generic-namespace1 and custom ILM Policy n1
    • Index Template logs-udp.generic-namespace2 with index pattern logs-udp.generic-namespace2 and custom ILM Policy n2
  • Define an ingest pipeline logs-udp.generic@custom with the reroute processor to route to namespace1 or namespace2 (you can test the routing with the simulate call below)
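
The pipeline body is the same idea as the reroute sketch in Approach A, just created under the logs-udp.generic@custom name. You can dry-run the routing with the simulate API before sending real traffic (env is the same illustrative placeholder field as above):

```
POST _ingest/pipeline/logs-udp.generic@custom/_simulate
{
  "docs": [
    { "_source": { "env": "prod", "message": "sample event" } },
    { "_source": { "env": "dev",  "message": "sample event" } }
  ]
}
```

The response should show where each document would end up being routed.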

Approach A allows you to have your own "assets".

  • If you have to modify the Index Template, you have to do it 3 times.

Approach B makes you dependent on all the assets of udp.generic, which can be an advantage or a disadvantage.

  • If you have to modify the Index Template, you have to do it 3 times.
  • If the Index Template structure gets changed by Fleet (enhancements, etc.), you might need to re-clone to align with the new Index Template of the Custom UDP Logs integration.

I hadn't looked at the reroute processor - it does exactly what I need! I tried to update the dataset or namespace manually in the pipeline, but it failed because the value wasn't allowed by the constant_keyword field type.

Thank you!

Thank you - I'm essentially using Approach B, but hadn't seen the reroute processor, which was the missing piece.

Thank you!