Logstash reference condition


(fadi) #1

Let me describe my case to keep things simple. I have 2 CSV files, reference_1.csv and reference_2.csv, plus the one I am indexing.

I have 2 fields test1 and test2

test1 has {
"test1_ID": "123"
etc
}

test2 is
test2{
"test2_name": "name"
etc
}
reference_2.csv has

test1_id  test2_name  Description  code_name
12        name1       xxxx         xxxx
12        name2       xxxx         xxxx
13        name1       xxxx         xxxx
13        name2       xxxx         xxxx

I want to check whether test1_id is present in reference_2. If it is, I look it up there; otherwise I look it up in reference_1.

Then I want to use test1_id and test2_name together as the key to fetch code_name, so I get

test1{
"test1_ID": "123"
"code_name": "codename"
etc
}


#2

One way to do this is to transform reference_2.csv into something that a translate filter can use. Suppose we start off with

col1,col2,"Other","Stuff"
3,255,Lorem ipsum dolor sit amet,consectetur adipiscing elit
2,256,sed do eiusmod tempor incididunt,ut labore et dolore magna aliqua

and we run it through a configuration like this

input { stdin {} }
filter {
    csv { autodetect_column_names => true target => "object" }
    mutate { rename => { "[object][col2]" => "[key]" } }
}
output { stdout { codec => line { format => '"%{key}": %{object}' } } }

using

/usr/share/logstash/bin/logstash -f /path/to/file.conf --path.settings /etc/logstash < lookup.csv > dictionary.yml

that gets you a file that looks like this

"255": {"Stuff":"consectetur adipiscing elit","Other":"Lorem ipsum dolor sit amet","col1":"3"}
"256": {"Stuff":"ut labore et dolore magna aliqua","Other":"sed do eiusmod tempor incididunt","col1":"2"}

Note that we are not using numeric keys, we are converting them to strings.
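
For the two-column key in the question, the same kind of dictionary could also be produced by a small standalone Ruby script rather than a Logstash run. This is only a sketch: it assumes reference_2.csv is comma-separated with the header names shown in the question, joins the two key columns with "|" (an arbitrary choice), and uses made-up Description and code_name values.

```ruby
require "csv"
require "yaml"

# Build a { "test1_id|test2_name" => code_name } hash from CSV text.
# The "|" separator and the comma-delimited layout are assumptions.
def build_dictionary(csv_text, sep = "|")
  dict = {}
  CSV.parse(csv_text, headers: true) do |row|
    key = [row["test1_id"], row["test2_name"]].join(sep)
    dict[key] = row["code_name"]
  end
  dict
end

# Invented sample rows standing in for reference_2.csv.
SAMPLE = <<~CSV
  test1_id,test2_name,Description,code_name
  12,name1,first,alpha
  12,name2,second,beta
  13,name1,third,gamma
CSV

# Print the dictionary as YAML, one composite key per line.
puts build_dictionary(SAMPLE).to_yaml
```

The event would then need a matching key before the translate filter runs, for example one built with add_field from "%{[test1][test1_ID]}|%{[test2][test2_name]}", assuming those are the nested field names.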

If you then configure a translate filter to use that file, it looks up the string "255" (add_field always adds strings, not integers) and parses the JSON value for you:

    mutate { add_field => { "key" => 255 } }
    translate { dictionary_path => "/home/user/dictionary.yml" field => "key" destination => "[@metadata][dict]" }
    mutate { add_field => { "stuff" => "%{[@metadata][dict][Stuff]}" } }

results in

 "@metadata" => {
    "dict" => {
        "Other" => "Lorem ipsum dolor sit amet",
         "col1" => "3",
        "Stuff" => "consectetur adipiscing elit"
    }
},
       "key" => "255",
     "stuff" => "consectetur adipiscing elit"
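
The string-versus-integer point can be checked outside Logstash. The sketch below loads the two dictionary lines with Ruby's own YAML parser and looks up a key; it only mimics the lookup and is not how the translate filter is implemented. The lookup helper is a hypothetical name.

```ruby
require "yaml"

# The two dictionary lines from the generated file.
DICTIONARY_YAML = <<~'YAML'
  "255": {"Stuff":"consectetur adipiscing elit","Other":"Lorem ipsum dolor sit amet","col1":"3"}
  "256": {"Stuff":"ut labore et dolore magna aliqua","Other":"sed do eiusmod tempor incididunt","col1":"2"}
YAML

DICTIONARY = YAML.safe_load(DICTIONARY_YAML)

# The keys are strings, so convert whatever we are given to a string first.
def lookup(dictionary, key)
  dictionary[key.to_s]
end

entry = lookup(DICTIONARY, 255)
puts entry["Stuff"]

# An integer key is not present; only the string "255" is.
puts DICTIONARY.key?(255).inspect
```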

There is another way to do this, which I am still thinking about, but this would work.


(fadi) #4

I got a little lost here. Is there a way to use a Ruby hash for something simpler? Both my test1_ID and test2_name are nested fields. For example, the test1 field has

test1 {

test_ID
test_before
test_after
}

I want to add a column from the CSV file called code_name, which is keyed by the 2 nested fields test1_ID and test2_name, so that the field becomes
test1 {
code_name
test_ID
test_before
test_after
}
Also, Logstash runs as a service here, so I cannot run manual commands every time.


(fadi) #5

Sorry for bothering you so much, but can you please help me with this problem?


#6

If you cannot transform the data into something that a translate filter could use then you could implement all of the logic in ruby.
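
As a rough idea of what that logic might look like, here is a plain Ruby sketch that runs outside Logstash. It assumes both reference files have already been loaded into hashes keyed by [test1_id, test2_name]; the layout of reference_1 is not given in this thread, so treating it the same way is a guess, and every value below is invented. The names pick_table and enrich are hypothetical.

```ruby
# Hypothetical lookup tables: [test1_id, test2_name] => code_name.
# All values are invented; reference_1's layout is an assumption.
REFERENCE_2 = {
  ["12", "name1"] => "code_a",
  ["12", "name2"] => "code_b",
}.freeze

REFERENCE_1 = {
  ["99", "name1"] => "code_z",
}.freeze

# Use reference_2 when it has any row for this id, otherwise reference_1.
def pick_table(id, ref2 = REFERENCE_2, ref1 = REFERENCE_1)
  ref2.keys.any? { |key_id, _name| key_id == id } ? ref2 : ref1
end

# Copy code_name into the nested test1 hash when a match is found.
def enrich(event)
  id   = event.dig("test1", "test1_ID")
  name = event.dig("test2", "test2_name")
  code = pick_table(id)[[id, name]]
  event["test1"]["code_name"] = code if code
  event
end

event = {
  "test1" => { "test1_ID" => "12" },
  "test2" => { "test2_name" => "name2" },
}
puts enrich(event).inspect
```

Inside a ruby filter the same steps would read and write through the event API, for example event.get("[test1][test1_ID]") and event.set("[test1][code_name]", code), with the tables loaded once in the filter's init option rather than on every event.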


(fadi) #7

Well, I'm not good at Ruby. Can you tell me how?