Translate filter plugin

Dear colleagues,

I am using many translate filters to mutate data in Logstash. Each filter requires a dictionary file (.yaml) that contains exactly two columns, so I end up needing lots of .yaml files.
It would be much easier for me if there were either:

  • A filter plugin that creates these YAML dictionary files from, e.g., a single .csv file
  • A dictionary file format that contains several columns, where I would specify the column from which the target value is pulled, similar to Excel's "VLOOKUP(...)"

Is either of these possible today, or planned for the future?
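
Until something like the first option exists, the splitting can be done outside Logstash. The following is a minimal Python sketch, not an existing plugin: the function name `split_into_dictionaries` and the sample columns are made up for illustration, and it assumes the multi-column CSV has a header row. It emits one two-column CSV per value column, with no header line, on the assumption that the translate filter treats every line of a CSV dictionary as data.

```python
import csv
import io


def split_into_dictionaries(csv_text, key_column):
    """Return {value_column: two-column CSV text} for every non-key column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    outputs = {}
    for column in rows[0].keys():
        if column == key_column:
            continue
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        for row in rows:
            # First column is the lookup key, second is the translation
            writer.writerow([row[key_column], row[column]])
        outputs[column] = buffer.getvalue()
    return outputs


if __name__ == "__main__":
    # Hypothetical sample data; replace with the contents of the real file
    sample = "code,name,region\nCH,Switzerland,Europe\nJP,Japan,Asia\n"
    for column, text in split_into_dictionaries(sample, "code").items():
        print(f"--- {column}.csv ---")
        print(text, end="")
```

Each piece of generated text could then be saved under its own file name and referenced from a separate translate filter.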

Best,
Michael


If you want to enrich events with more complex data, you can always format the dictionary value as a JSON string and then use the json filter to parse it:

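input {
  # Emit a single test event whose message is "test"
  generator {
    lines => ['test']
    count => 1
  }
}

filter {
  # Look up the message; the match is stored in the "translation" field
  translate {
    field => "message"
    dictionary => {
      "test" => '{"a":1,"b":2}'
    }
  }

  # Parse the JSON string into event fields, then drop the raw string
  json {
    source => "translation"
    remove_field => ["translation"]
  }
}

output { stdout { codec => rubydebug } }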

That works OK with JSON, but it does not work very well with a simple multi-column CSV file. You only get the second column back as the translation.

I think it might be useful to have a lookup filter that would return a row of the CSV as a hash, but I'm struggling to come up with the right interface for it.
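
In the meantime, a multi-column CSV can be converted into a dictionary that the translate filter already understands, with each row encoded as a JSON string. The sketch below is one way to do that in Python, under stated assumptions: the CSV has a header row, `build_row_dictionary` and the sample columns are hypothetical names, and the result is meant to be saved as a .json dictionary file and combined with the json filter.

```python
import csv
import io
import json


def build_row_dictionary(csv_text, key_column):
    """Map each key to a JSON string holding the remaining columns of its row."""
    dictionary = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row.pop(key_column)
        # The value must be a string so that translate returns it unchanged
        dictionary[key] = json.dumps(row, sort_keys=True)
    return dictionary


if __name__ == "__main__":
    # Hypothetical sample data; replace with the contents of the real file
    sample = "code,name,region\nCH,Switzerland,Europe\nJP,Japan,Asia\n"
    print(json.dumps(build_row_dictionary(sample, "code"), indent=2))
```

The printed object has one entry per CSV row, so a single lookup returns the whole row and the json filter can expand it into fields.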

Yes, you are indeed right.

@Michael.Swiss

How would you specify the "VLOOKUP" column? I mean where would it come from? A field?

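If so, a dictionary that maps each key to a JSON string can work.
Put the JSON object into the event with the translate and json filters, as in Christian's example, but move the remove_field out of the json filter so the parsed data is still available. This assumes the parsed object ends up under [translation], for example by setting target => "translation" on the json filter. Then use: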

  mutate {
    # rename takes "old name" => "new name"; %{second_key} selects the column
    rename => {"[translation][%{second_key}]" => "final_field"}
    remove_field => ["translation"]
  }
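
To make the intended behaviour concrete, here is a small Python sketch that mimics the lookup on a plain dict standing in for an event. It is an illustration only, not Logstash code: `vlookup`, `key_field`, and `column_field` are hypothetical names, and the dictionary values are assumed to be JSON strings as described above.

```python
import json


def vlookup(event, dictionary, key_field, column_field, target="final_field"):
    """Mimic translate + json + mutate on a dict that stands in for an event."""
    encoded_row = dictionary.get(event.get(key_field))
    if encoded_row is None:
        return event  # no dictionary match: the event is left unchanged
    row = json.loads(encoded_row)
    column = event.get(column_field)
    if column in row:
        # Equivalent of renaming [translation][<column>] to the target field
        event[target] = row[column]
    return event


if __name__ == "__main__":
    dictionary = {"test": '{"a":1,"b":2}'}
    event = {"message": "test", "second_key": "b"}
    print(vlookup(event, dictionary, "message", "second_key"))
```

Here the event field `second_key` plays the role of the VLOOKUP column index: it decides which entry of the looked-up row is copied into `final_field`.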

The translate filter also accepts a CSV dictionary file:

The currently supported formats are YAML, JSON, and CSV. Format selection is based on the file extension: json for JSON, yaml or yml for YAML, and csv for CSV. The CSV format expects exactly two columns, with the first serving as the original text (lookup key), and the second column as the translation.

-- Logstash Translate Filter: dictionary_path
