Event-dependent configuration broken with multiple pipeline workers?

Hi,

If I set pipeline.workers > 1 then event-dependent configuration ("%{}" references) appears to be disabled. Here is my test case:

#!/bin/bash

# This works
export TIX_VERSION=1
gunzip --stdout tmp/cart.gz | logstash --pipeline.workers 1 --path.config=logstash/cart_logstash.conf

# This does not
export TIX_VERSION=2
gunzip --stdout tmp/cart.gz | logstash --pipeline.workers 4 --path.config=logstash/cart_logstash.conf

The single-worker run creates a series of indices, one per slug, with each document's _id taken from its id field. The four-worker run creates a single index literally named cart-%{slug}-v2, containing one document whose _id is the literal %{id}, plus lots of deleted documents.

I'm running logstash 6.2.4, which brew tells me is current.

My config file is as follows:

input {
    stdin {}
}
filter {
    mutate {
        gsub => ["message", "\"", "'"]
        # Double quote problem https://discuss.elastic.co/t/csv-filter-quote-character-causing-csvparsefailure/121273/3 '
    }
    csv {
        autodetect_column_names => "true"
        remove_field => ["message"] # message is the unparsed row
        separator => "	" # Yes, that is supposed to be a literal tab
        skip_empty_columns => "true"
        skip_header => "true"
    }
    date {
        match => ["arrival_date", "YYYY-MM-dd HH:mm:ss"]
        target => "arrival_date"
        timezone => "UTC"
    }
    mutate {
        remove_field => ["host"]
    }
}
output {
    elasticsearch {
        document_id => "%{id}"
        hosts => ["${ES_HOST}:${ES_PORT}"]
        http_compression => "true"
        index => "cart-%{slug}-v${TIX_VERSION}"
        manage_template => "false"
        password => "${ES_PASS}"
        ssl => "true"
        user => "${ES_USER}"
    }
}

Have I missed something, or should I open an issue?

If I set pipeline.workers > 1 then event-dependent configuration ("%{}" references) appears to be disabled.

That sounds extremely unlikely, but I don't really have a better suggestion to offer. What happens if you replace your elasticsearch output with stdout { codec => rubydebug }? Do all resulting events contain the required id and slug fields?
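For example, swapping the whole output block for this while debugging:

output {
    # Temporarily dump every event to stdout so we can inspect all fields
    stdout { codec => rubydebug }
}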

stdout { codec => rubydebug } is giving me nonsense, like this:

{
    "@version" => "1",
    "locationA" => "locationA",
    "@timestamp" => 2018-07-30T20:35:38.032Z,
    "0.00" => "0.00",
    "2016-03-19 22:00:00" => "2015-11-11 21:00:00",
    "0000788c-87da-4219-9379-dd7f0febd1ac" => "0001c971-cc81-4cf1-8ae5-843958777a2d",
    "8" => "3",
    "Amanda.Example@notreal.com" => "Barry@example.com",
    "Amanda Example" => "Barry Fake Person",
    "Venue A" => "Venue A"
}

The keys here look like values from another row of the file, so perhaps autodetect_column_names has failed?

Yes, that's most likely the case. I've never trusted the autodetect feature.
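One workaround, if you want to keep multiple workers, is to list the columns explicitly so parsing no longer depends on which event each worker happens to see first. A sketch, where the column names are placeholders (substitute your file's real header row):

csv {
    # Explicit column list: no per-worker header detection.
    # Placeholder names inferred from the debug output above; replace
    # them with the actual header of cart.gz.
    columns => ["id", "slug", "arrival_date", "location", "price", "quantity", "email", "name", "venue"]
    remove_field => ["message"]
    separator => "	"
    skip_empty_columns => "true"
    skip_header => "true"
}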


There is already an issue for this in logstash-filter-csv.
