Compare a field with a CSV file and add info from the CSV in a new field

Hello,

I have a field named id from a grok filter, which I want to compare with a CSV file. For example:
ELK index:

id=12

CSV file:

11,home
12,garden
13,office 
...

How can I compare the id field with the first column of my CSV and, if it matches, add a new field with the corresponding string, like:

id=12
location=garden

I'm able to use translate to compare against a single value per line in a CSV file and add a boolean value if it matches:

CSV file:

11
12
13

conf:

translate {
    field => "[id]"
    destination => "[location]"
    dictionary_path => '/home/file.csv'
    refresh_interval => '1000'
}

result

id=12
location=True

But I don't see how to add a field with the corresponding value. Thank you.

If file.csv contains

11,home
12,garden
13,office

then this configuration

input { generator { count => 1 lines => [ '' ] } }

filter {
    mutate { add_field => { "id" => "12" } }
    translate {
        field => "[id]"
        destination => "[location]"
        dictionary_path => '/home/file.csv'
        refresh_interval => '1000'
    }
}
output { stdout { codec => rubydebug { metadata => false } } }

will produce

  "location" => "garden",
        "id" => "12",

Your filter should work as is.


Do I have to add all the id values in the mutate filter if I want to catch them all? Because there are a lot of ids ...

mutate {
    add_field => { "id" => "11" }
    add_field => { "id" => "12" }
    add_field => { "id" => "13" }
}

etc ... ?

No, you are missing my point. I was trying to show that if the event has a field id which contains "12", then that translate filter will set [location] to "garden".

If the filter is not doing that it suggests the event does not have an [id] field.
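
If you want to confirm that, one quick check (just a sketch; the tag name is arbitrary) is to tag events that arrive without the field and look for the tag in the rubydebug output:

filter {
    if ![id] {
        # "missing_id" is an arbitrary tag name
        mutate { add_tag => [ "missing_id" ] }
    }
}

Any event that comes out tagged missing_id never had an [id] field for translate to look up.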

So what's the correct way to do that overall? Sorry for the delay.

The filter you had in your initial post is the right way to do it. If it does not do the translation the only explanation I can think of is that the event does not have an [id] field.

Thank you Badger,

I found the problem.

The field id is an array object.
In my grok filter:

u'ids': \[(?<[@metadata][ids]>[^\]]+)\]

In my ruby filter:

ids = event.get('[@metadata][ids]')
if ids
    id = ids.scan(/{u'type': u'([^']+)', u'id': ([0-9]+)}/)
    event.set('id', id)
end

So in Kibana, it's visible like this:

["firstid", "1"]
["secondid", "2"]

Logstash doesn't like this and keeps restarting in a loop. I'm unable to select which value I want in id.
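
One way around that, assuming each scanned tuple is [type, id] and only the first numeric id is needed (a sketch, not something I have tested against my data), would be to set a scalar value in the ruby filter instead of the whole array:

ids = event.get('[@metadata][ids]')
if ids
    # scan returns an array of [type, id] pairs; keep only the numeric id of the first pair
    matches = ids.scan(/{u'type': u'([^']+)', u'id': ([0-9]+)}/)
    event.set('id', matches[0][1]) unless matches.empty?
end

With a scalar id, the translate lookup against the CSV should behave like in the example above.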
Edit: I tried with another field which is not nested (%{GREEDYDATA:dst-ip}), but it doesn't work :frowning:

translate {
    field => "[dst-ip]"
    destination => "[malicious]"
    dictionary_path => '/home/ip.csv'
    refresh_interval => '1000'
}
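
To see whether the lookup runs at all, the translate filter also has a fallback option that fills the destination when nothing in the dictionary matches (the "no_match" value below is just an arbitrary marker):

translate {
    field => "[dst-ip]"
    destination => "[malicious]"
    dictionary_path => '/home/ip.csv'
    refresh_interval => '1000'
    # "no_match" is an arbitrary marker value
    fallback => "no_match"
}

If [malicious] comes out as "no_match", the filter is running but the value in [dst-ip] does not exactly match any first column of ip.csv; if [malicious] does not appear at all, the event most likely does not have a [dst-ip] field when it reaches the filter.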
