Logstash add field from another input

(fadi) #1

I have a CSV file with this field:
"type" : "255~15674532~15724727~50195|256~41089839104~41221424128~131585024"
and another CSV file with three fields: value, name, and function.

First I used this ruby filter code to split the type field:

    ruby {
        code => "
            # Split on | into one entry per type, then on ~ into the four values.
            y = event.get('type').split('|').collect { |t|
                c = t.split '~'
                {
                    'type_ID' => c[0].to_i,
                    'type_Before' => c[1].to_i,
                    'type_After' => c[2].to_i,
                    'type_Change' => c[3].to_i
                }
            }
            event.set('Type', y)
        "
    }


and I got nested fields as a result.
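For reference, the nested result can be reproduced outside Logstash. This is a minimal sketch in plain Ruby, assuming the same `~` and `|` separators as the field above; `parse_types` is a name made up for illustration and stands in for the body of the ruby filter.

```ruby
# Plain-Ruby version of the split logic, runnable without Logstash.
# parse_types is a hypothetical helper name.
def parse_types(raw)
  raw.split('|').collect do |t|
    c = t.split('~')
    {
      'type_ID'     => c[0].to_i,
      'type_Before' => c[1].to_i,
      'type_After'  => c[2].to_i,
      'type_Change' => c[3].to_i
    }
  end
end

sample = '255~15674532~15724727~50195|256~41089839104~41221424128~131585024'
types = parse_types(sample)

puts types.length            # 2
puts types[0]['type_ID']     # 255
puts types[1]['type_After']  # 41221424128
```

Each element of the array is one hash, which is why Logstash shows the result as nested fields such as [Type][0][type_ID].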

The type_ID is the same as value in the other CSV file. What I want is to return the name and function fields for each type_ID: I receive arbitrary type_IDs and want to index them together with their name and function from the other CSV file. I don't know whether there is a filter that can help, or whether I should use the ruby filter or the translate filter.


A translate filter would work provided you always have two types.

    translate {
        field => "[Type][0][type_ID]"
        destination => "[Type][0][type_Name]"
        dictionary_path => "/home/user/foo.csv"
    }
    translate {
        field => "[Type][1][type_ID]"
        destination => "[Type][1][type_Name]"
        dictionary_path => "/home/user/foo.csv"
    }

If it is a variable number then I would extract some of the code from the translate filter and implement it in ruby.

(fadi) #3

Yes, it is a variable number. I have a lot of types, so can you show me how to implement it in a ruby filter, showing the type_name and type_function for each type?

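    ruby {
        init => '
            require "csv"
            # Load the CSV once at startup: column 1 is the key, column 2 the value.
            @dict = Hash.new
            CSV.foreach("/home/user/foo.csv", encoding: "UTF-8", headers: true, header_converters: :symbol, converters: :all) do |row|
                @dict[row[0]] = row[1]
            end
        '
        code => '
            t = event.get("Type")
            # Look up the name for every entry, however many there are.
            t.each_index { |i|
                t[i]["type_Name"] = @dict[t[i]["type_ID"]]
            }
            event.set("Type", t)
        '
    }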

will give you

    "Type" => [
        [0] {
            "type_Before" => 15674532,
            "type_Change" => 50195,
                "type_ID" => 255,
             "type_After" => 15724727,
              "type_Name" => "Foo"
        },
        [1] {
            "type_Before" => 41089839104,
            "type_Change" => 131585024,
                "type_ID" => 256,
             "type_After" => 41221424128,
              "type_Name" => "Bar"
        }
    ]

if the csv contains


Error handling is left as an exercise for the reader.
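One possible shape for that error handling is sketched below in plain Ruby. It assumes the same array-of-hashes structure for Type and a dictionary keyed by integer ID; `add_type_names` is a hypothetical helper, not something Logstash provides. Inside the filter, `types` would come from event.get("Type") and `dict` would be @dict.

```ruby
# Hypothetical helper: adds type_Name to each entry and tolerates bad input.
def add_type_names(types, dict)
  # event.get returns nil when the field is missing, so bail out early.
  return types unless types.is_a?(Array)

  types.each do |entry|
    next unless entry.is_a?(Hash)
    id = entry['type_ID']
    # Only set the name when the dictionary actually contains this ID.
    entry['type_Name'] = dict[id] if dict.key?(id)
  end
  types
end

dict  = { 255 => 'Foo', 256 => 'Bar' }
types = [{ 'type_ID' => 255 }, { 'type_ID' => 999 }]
add_type_names(types, dict)

puts types[0]['type_Name']          # Foo
puts types[1].key?('type_Name')     # false
```

Whether an unknown ID should be skipped, tagged, or given a default name is a design choice; skipping is just the simplest option here.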

(fadi) #5

Thank you so much for your help. I wanted to ask whether we could add two @dict entries so that it takes more than one row, for example.


I do not understand the question. The init loads every row of the CSV (except the first) into the hash.

(fadi) #7

Oh, so if I want to load, for example, a second row, I add

@dict[row[0]] = row[2]


No. CSV.foreach iterates over every row of the file. Each row of the csv is passed to the block as an array so

@dict[row[0]] = row[1]

adds an entry to the hash called @dict whose key is column 1 and whose value is column 2.

As I said, the entire file except for the first row gets added to the hash.
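The loading behaviour can be checked with a small standalone script. This sketch uses CSV.parse on an in-memory string instead of CSV.foreach on a file so that it runs anywhere; the sample rows and header names are assumptions made up for illustration.

```ruby
require 'csv'

# Sample data is an assumption; the real file contents are not shown in the thread.
data = "id,name\n255,Foo\n256,Bar\n"

dict = {}
# headers: true consumes the first line as the header, so it never reaches the block.
# converters: :all turns numeric strings such as "255" into integers.
CSV.parse(data, headers: true, converters: :all) do |row|
  dict[row[0]] = row[1]   # column 1 becomes the key, column 2 the value
end

puts dict.size   # 2
puts dict[255]   # Foo
puts dict[256]   # Bar
```

Every data row ends up in the hash, one entry per row, and the header row is not among them.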

(fadi) #9

Oh sorry, my mistake. I didn't mean row, I meant column: what if I wanted the first column to also be the key for the third column?


No, you cannot. If you have a CSV that looks like this


then if you run something like

        CSV.foreach("/home/user/foo.csv", encoding: "UTF-8", headers: true, header_converters: :symbol, converters: :all) do |row|
            @dict[row[0]] = row[1]
            @dict[row[0]] = row[2]
        end

The second entry overwrites the first, and you end up with a hash just containing

{255=>"Hello", 256=>"World"}
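If the goal is to keep both columns for each ID, one option is to store a small hash as the value instead of assigning twice. The sketch below is an assumption about the file layout, based on the value, name, function columns mentioned at the top of the thread; the sample rows and the type_Function field name are made up for illustration.

```ruby
require 'csv'

# Assumed layout: value,name,function. Sample rows are illustrative only.
data = "value,name,function\n255,Foo,Hello\n256,Bar,World\n"

dict = {}
CSV.parse(data, headers: true, converters: :all) do |row|
  # Still one value per key, but the value holds both columns.
  dict[row[0]] = { 'name' => row[1], 'function' => row[2] }
end

entry = { 'type_ID' => 255 }
found = dict[entry['type_ID']]
if found
  entry['type_Name']     = found['name']
  entry['type_Function'] = found['function']
end

puts entry['type_Name']      # Foo
puts entry['type_Function']  # Hello
```

In the ruby filter, the same assignment would go in the init block, and the lookup would go inside the each_index loop.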

(fadi) #11

Okay, thank you, and one last question: can I make two keys for one value?


Yes, if you wanted the lookup to be the other way around, from name to id, then you could do something like

        @dict[row[1]] = row[0]
        @dict[row[2]] = row[0]

(fadi) #13

okay thank you

(fadi) #14

If I want to check whether a key exists in a column, what do I use so that it doesn't raise an exception?
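For a plain Ruby Hash like the dictionary used earlier in the thread, a missing key does not raise by itself: the [] lookup returns nil. A minimal sketch of the usual checks, assuming integer keys as produced by converters: :all:

```ruby
dict = { 255 => 'Foo', 256 => 'Bar' }

puts dict.key?(255)              # true
puts dict.key?(999)              # false
puts dict[999].nil?              # true, a missing key gives nil rather than an exception
puts dict.fetch(999, 'unknown')  # unknown

# Key types must match: the string "255" is a different key from the integer 255.
puts dict.key?('255')            # false
```

The exception usually comes later, when a method is called on the nil result, so checking with key? or supplying a default through fetch avoids it.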

(fadi) #15

Let me describe my case to simplify things. I have two CSV files, reference_1.csv and reference_2.csv,
plus the one I am indexing.

I have two fields, test1 and test2.

test1 has {
"test1_ID": "123"
}

test2 is
"test2_name": "name"
reference_2.csv has

    test1_id  test2_name  Description  code_name
    12        name1       xxxx         xxxx
    12        name2       xxxx         xxxx
    13        name1       xxxx         xxxx
    13        name2       xxxx         xxxx

I want to check whether test1_id is available in reference_2; if it is, I look it up there, otherwise I look it up in reference_1.

Then I want to use test1_id and test2_name as keys to get code_name,
so I get

"test1_ID": "123"
"code_name": "codename"

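A two-column key like this can be modelled in Ruby by using an array as the hash key. The sketch below is only an illustration under several assumptions: the layout of reference_1.csv is not given in the thread, so a two-column test1_id,code_name file is guessed; the code_name values are invented; `lookup_code_name` is a hypothetical helper; and the fallback to reference_1 happens whenever the (test1_id, test2_name) pair is missing from reference_2.

```ruby
require 'csv'

# Assumed contents. reference_1's layout is a guess for illustration only.
ref2_data = "test1_id,test2_name,Description,code_name\n" \
            "12,name1,xxxx,code_a\n" \
            "12,name2,xxxx,code_b\n"
ref1_data = "test1_id,code_name\n12,fallback_12\n123,fallback_123\n"

# No converters here, so every value stays a string, like "test1_ID": "123".
ref2 = {}
CSV.parse(ref2_data, headers: true) do |row|
  ref2[[row['test1_id'], row['test2_name']]] = row['code_name']
end

ref1 = {}
CSV.parse(ref1_data, headers: true) do |row|
  ref1[row['test1_id']] = row['code_name']
end

# Hypothetical helper: try the composite key first, then fall back to reference_1.
def lookup_code_name(id, name, ref2, ref1)
  ref2.fetch([id, name]) { ref1[id] }
end

puts lookup_code_name('12', 'name2', ref2, ref1)   # code_b
puts lookup_code_name('123', 'name', ref2, ref1)   # fallback_123
```

In a ruby filter, the two hashes would be built once in init and the lookup would run in code for each event.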
(fadi) #16

better open a new topic