String interpolation in Logstash data_stream fields

How can I write to different data streams for different Kinesis inputs? We are trying to add fields in the inputs and use string interpolation in the outputs, to no avail.

We had been using index + ilm_enabled in Logstash, like this:

In kinesis input:

        add_field => { _meta_index => "api-events" }

in elasticsearch output:

          index => "%{_meta_index}"
          ilm_enabled => "true"

we are trying to migrate to using data streams, like this:

in kinesis input:

        add_field => { _meta_data_stream_dataset => "api" }
        add_field => { _meta_data_stream_namespace => "events" }

in elasticsearch output we've tried both:

          data_stream  => "true"
          data_stream_type => "logs"
          data_stream_dataset => "events-%{_meta_data_stream_dataset}"
          data_stream_namespace => "%{_meta_data_stream_namespace}"

and

          data_stream  => "true"
          data_stream_type => "logs"
          data_stream_dataset => "events-%{[_meta_data_stream_dataset]}"
          data_stream_namespace => "%{[_meta_data_stream_namespace]}"

but neither works, instead writing to

.ds-logs-events-%{_meta_data_stream_dataset}-%{_meta_data_stream_namespace}-2022.10.26-000001

and

.ds-logs-events-%{[_meta_data_stream_dataset]}-%{[_meta_data_stream_namespace]}-2022.10.26-000001

I know my fields are being set because I can see them in an example record.

I don't think this is supported. There was a similar question about it, and it seems that the data_stream settings do not sprintf the value.

I do not use data streams, but from the elasticsearch output documentation there is a setting named data_stream_auto_routing that may help you achieve what you want.

From what I understand, you will need to set this to true and create the fields data_stream.type, data_stream.dataset and data_stream.namespace. If those fields exist in the event, they will be used instead of the settings.

I understand the general vibe that there is an opinionated solution, I just don't know what it is.

Would an Elasticsearch team member be able to post a specific, working example? The docs have various references to dot notation and [bracket] notation, and it's unclear what the exact correct format is.

None of these seem to be valid configurations:

        add_field => { data_stream => { dataset => "events_api" } }
        add_field => { data_stream => { namespace => "events" } }
        add_field => { data_stream.dataset => "events_api" } }
        add_field => { data_stream.namespace => "events" } }
        add_field => { [data_stream][dataset] => "events_api" } }
        add_field => { [data_stream][namespace] => "events" } }

Doing both

    input {
      kinesis {
        ...
        add_field => { _meta_data_stream_dataset => "api" }
        add_field => { _meta_data_stream_namespace => "events" }
      }
    }

    filter {
      mutate {
        add_field => { "[data_stream][dataset]" => "%{_meta_data_stream_dataset}" }
        add_field => { "[data_stream][namespace]" => "%{_meta_data_stream_namespace}" }
      }
    }

works but is quite gross - is there a nice way to just set it in the input?

In Logstash, when you want to work with a nested field like data_stream.type, you need to refer to it as [data_stream][type]. If you use data_stream.type in Logstash, it refers to a field with a literal dot in its name.

In Elasticsearch and Kibana, you refer to the same nested field as data_stream.type.

The [bracket][nested] notation is only used in Logstash, which can be confusing sometimes.

So, to add a nested field you just use:

    add_field => { "[top-level][nested]" => "value" }
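
As a rough illustration of the difference between the two notations, here is an untested sketch. The field names example, nested and example.dotted are made up for the illustration and are not part of any plugin:

    filter {
      mutate {
        # Bracket notation: creates a nested object, { "example": { "nested": "value" } }
        add_field => { "[example][nested]" => "value" }
        # Dot notation: inside Logstash this is one top-level field
        # whose name contains a literal dot, not a nested field
        add_field => { "example.dotted" => "value" }
      }
    }

Only the first form is what the data_stream.* routing fields need.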

Just use the nested fields in the input.

    input {
      kinesis {
        ...
        add_field => { "[data_stream][dataset]" => "api" }
        add_field => { "[data_stream][namespace]" => "events" }
        add_field => { "[data_stream][type]" => "logs" }
      }
    }

From the documentation, you need to add all 3 fields or it will fall back to the configuration in the output.
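
For completeness, the matching output might look something like the sketch below. I have not run this exact config. The hosts value is a placeholder, and the dataset and namespace values are arbitrary fallbacks chosen for the example. They should only apply when an event does not carry the [data_stream][...] fields.

    output {
      elasticsearch {
        hosts => ["https://localhost:9200"]   # placeholder, not from this thread
        data_stream => "true"
        # Documented as the default; spelled out here for clarity
        data_stream_auto_routing => "true"
        # Fallbacks, used only when the event lacks the [data_stream][...] fields
        data_stream_type => "logs"
        data_stream_dataset => "generic"
        data_stream_namespace => "default"
      }
    }

With the three fields set in the input as shown, events should be routed to the logs-api-events data stream, since the name is built as type-dataset-namespace.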

If ilm_enabled is true, then the index option is overwritten, so that is not going to do what you want.
