Merge data and metadata CSV files in Logstash

Hi,
I'm trying to import and combine two CSV files, as follows:

logstash_metadata.csv:

Sample,Treatment,Code
S4444003,T_7896_D3,G10
S4444004,T_4516_D0t1h,G01

logstash_file.csv:

Sample,genus,value
S4444003,Chloronema,8
S4444003,Pseudaminobacter,3

I join the two files on the Sample field, using this pipeline:

input {
  file {
    path => "logstash_file.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => ["16S"]
  }
}
filter {
  csv {
    skip_header => false
    columns => ["Sample","genus","value"]
    separator => ","

  }
  mutate {
    convert => {
      "value" => "integer"
    }
  }

  translate {
    target => "[@metadata][match]"
    dictionary_path => "logstash_metadata.csv"
    source => "Sample"
  }
  dissect {
    mapping => {
      "message" => "%{Sample},%{Treatment},%{Code}"
    }
  }

  date {
    match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
    locale => en
    remove_field => ["timestamp"]
  }
}

output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "16s_s4444"
    data_stream => false

  }
  stdout {codec => rubydebug {metadata => true}}
}

The output is as follows:


{
             "path" => "logstash_file.csv",
       "@timestamp" => 2022-01-31T09:30:05.998Z,
            "genus" => "Pseudaminobacter",
           "Sample" => "S4444003",
       "Code" => "3",
        "@metadata" => {
         "path" => "logstash_file.csv",
        "match" => "7896_D3",
         "host" => "localhost"
    },
         "@version" => "1",
             "host" => "localhost",
    "Treatment" => "Pseudaminobacter",
          "message" => "S4444003,Pseudaminobacter,3",
            "value" => 3,
             "tags" => [
        [0] "16S"
    ]
}

Neither Treatment nor Code is assigned properly. Going by the metadata file, Treatment should be T_7896_D3 and Code should be G10.

I probably got something wrong. Can you help?

When a translate filter loads a CSV file as its dictionary, it only uses the first two columns: the first as the key and the second as the value.
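
As a rough illustration of what that means for the original three-column logstash_metadata.csv (the comments are my reading of the behaviour described above, not output from a run):

```
translate {
  source          => "Sample"
  target          => "[@metadata][match]"
  dictionary_path => "logstash_metadata.csv"
  # The dictionary row
  #   S4444003,T_7896_D3,G10
  # is loaded as key "S4444003" -> value "T_7896_D3".
  # The third column (G10) is not used, so Code can never
  # come out of this lookup as the file stands.
}
```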

[Treatment] and [Code] are incorrect because your dissect filter maps [message], so the data line "S4444003,Pseudaminobacter,3" is split into Sample, Treatment and Code. You probably meant to map [@metadata][match].

If you change the metadata csv to be

Sample,Treatment,Code
S4444003,T_7896_D3;G10
S4444004,T_4516_D0t1h;G01

that is, using a semicolon to separate what were the second and third columns, you could use

dissect { mapping => { "[@metadata][match]" => "%{Treatment};%{Code}" } }
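
Put together, the relevant part of the filter section might look like the sketch below. It assumes the semicolon-separated dictionary shown above and keeps only the filters that matter for the lookup.

```
filter {
  csv {
    columns   => ["Sample", "genus", "value"]
    separator => ","
  }
  mutate {
    convert => { "value" => "integer" }
  }
  # Look up the Sample value; the match is the combined "Treatment;Code" string.
  translate {
    source          => "Sample"
    target          => "[@metadata][match]"
    dictionary_path => "logstash_metadata.csv"
  }
  # Split the looked-up string, not the original CSV line.
  dissect {
    mapping => { "[@metadata][match]" => "%{Treatment};%{Code}" }
  }
}
```

With that in place, an event for S4444003 should end up with Treatment set to T_7896_D3 and Code set to G10.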

This thread discusses other options.

Great! It works perfectly.

However, and maybe this is expected, I get a strange warning that I do not understand. See below.

[2022-01-31T21:06:13,737][WARN ][org.logstash.dissect.Dissector][main][05e203d1d6dfb3d7d35f2e0da7b0e487a5e172bd69729fe989b65ade05620b48] Dissector mapping, pattern not found {"field"=>"[@metadata][match]", "pattern"=>"%{Treatment};%{Code}", "event"=>{"@version"=>"1", "Sample"=>"Sample", "value"=>0, "@timestamp"=>2022-01-31T20:06:12.911Z, "host"=>"localhost", "path"=>"logstash_file.csv", "tags"=>["16S", "_dissectfailure"], "message"=>"Sample,genus,value", "genus"=>"genus"}}
{

This is how my code looks now:


translate {
  target => "[@metadata][match]"
  dictionary_path => "logstash_metadata.csv"
  source => "Sample"
}
dissect {
  mapping => { "[@metadata][match]" => "%{TreatmentTime};%{Donor.Code}" }
}
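
The event in the warning is the header line of logstash_file.csv: its message is "Sample,genus,value" and every column holds its own name. For that line the lookup key is the literal string "Sample", and the dictionary's own header line most likely maps it to "Treatment", which contains no semicolon, so dissect cannot apply its pattern and tags the event with _dissectfailure. One possible way to avoid this, assuming the header line is not wanted in Elasticsearch, is to let the csv filter drop it:

```
csv {
  # With explicit columns, skip_header drops any row whose values
  # exactly match the column names, i.e. the header line.
  skip_header => true
  columns     => ["Sample", "genus", "value"]
  separator   => ","
}
```

The data rows are unaffected, since none of them match the column names exactly.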
