How do I use an array field as a variable in logstash?

I'm setting up logstash and would like to anonymize fields that the message itself specifies.

Given the message below, the field fta is an array of fields to anonymize. I would like to just use %{fta} and pass it through to the anonymize filter, but that doesn't seem to work.

{ "containsPII":"True", "fta":["f1","f2"], "f1":"test", "f2":"5551212" }

My config is as follows:

 input {
   stdin { codec => json }
 }

 filter {
   if [containsPII] {
     anonymize {
       algorithm => "SHA1"
       key => "123456789"
       fields => [fta]
     }
   }
 }

 output {
   stdout {
     codec => rubydebug
   }
 }

The output is:

{
    "containsPII" => "True",
            "fta" => [
        [0] "3c322875cb471164022de7ee7b8600ebc1c3845f",
        [1] "d9aeda754911ba13fe63f0da1964642856849a39"
    ],
             "f1" => "test",
             "f2" => "5551212",
       "@version" => "1",
     "@timestamp" => "2016-07-14T19:28:02.592Z",
           "host" => "..."
}

Does anyone have any thoughts? I have tried several permutations at this point with no luck.

Thanks,
-D

Sorry, I don't get what's wrong with the current output. Each array element has been hashed which is how the filter works. Do you want the array as a whole to be hashed and stored as a single string, or what's the desired results?

Sorry for the confusion. The desired result would be to let the fta array dictate which other fields the anonymize filter hashes, like below.

{
    "containsPII" => "True",
            "fta" => [
        [0] "f1",
        [1] "f2"
    ],
             "f1" => "3c322875cb471164022de7ee7b8600ebc1c3845f",
             "f2" => "d9aeda754911ba13fe63f0da1964642856849a39",
       "@version" => "1",
     "@timestamp" => "2016-07-14T19:28:02.592Z",
           "host" => "..."
}

The fta array specifies the target fields.

The use case is that the logging team is not aware of which fields should be anonymized, but the developers for each system are. They can easily send in an array of fields to anonymize, which logstash can then act on.

Oh, I see now. Yeah, that's not possible. You'd have to use a ruby filter (but it'd just be a few lines of code for this).

That's what I was starting to think. Thanks for confirming. It's a shame I couldn't just do ["%{fta}"].

Do you have a link to a tutorial, or the objects available in a ruby filter? I've tried searching but wasn't able to find a good starting point anywhere on the internals of logstash at that point.

Thank you again,
-D

No, I don't have any good documentation pointers except the source code. But basically you read and write fields via event['fieldname'], enumerate fields by treating event as a hash, and the rest is normal Ruby.
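To make that concrete, here is a minimal plain-Ruby sketch of the access pattern described above. It is not Logstash code: a plain Hash stands in for the event (an assumption for illustration), and the 'REDACTED' placeholder is made up. It only shows reading a field, writing fields by name, and enumerating fields.

```ruby
# A plain Hash stands in for the Logstash event (assumption for illustration).
event = {
  'containsPII' => 'True',
  'fta'         => ['f1', 'f2'],
  'f1'          => 'test',
  'f2'          => '5551212'
}

# Read a field: the array of field names to act on.
targets = event['fta']

# Write a field: overwrite each listed field with a placeholder value.
targets.each { |name| event[name] = 'REDACTED' }

# Enumerate fields by treating the event as a hash.
event.each { |name, value| puts "#{name} => #{value.inspect}" }
```

Inside a ruby filter, the same body would go in the code option, with event supplied by Logstash rather than built by hand.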

Ok, thank you again.

It's a bit easier than I thought and I'm making good progress. My filter now looks like:

filter {
 if [containsPII] {
    ruby {
      code => "event['fta'].each {|item| event[item] = '1' }"
      add_tag => ["Rubyrun"]
    }
 }
}

Is there a way to call another filter from within the ruby filter? Specifically, I would like to reuse the work already done in the anonymize filter. I'm relatively new to Ruby, so I'm not even sure how to ask the question. Perhaps something like:

filter {
 if [containsPII] {
    ruby {
      code => "event['fta'].each {|item| event[item] = Filters::Anonymize.filter(event[item],'123456789','sha1') }"
      add_tag => ["Rubyrun"]
    }
 }
}

Thanks

Well, I didn't find a way to use the existing anonymize function (it would be great if someone could show me how), but I did find a solution:

filter {
 if [fta] {
    ruby {
      init => "require 'openssl'"
      code => "event['fta'].each { |item| event[item] = OpenSSL::HMAC.hexdigest(OpenSSL::Digest::SHA256.new, '123456789', event[item] ) }"
    }
 }
}

Thank you again, Magnus, for the help.
-Derek

ETA: I updated it with a better version to ensure a uniform key. I also eliminated the containsPII field and am now just testing for the fta field.
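For anyone who wants to try the hashing logic outside Logstash, here is a standalone Ruby sketch of the same idea. The helper name anonymize_listed_fields is hypothetical, and a plain Hash stands in for the event. Unlike the one-liner above, it skips listed names that have no matching field and converts non-string values with to_s, since OpenSSL::HMAC.hexdigest expects string data. Those two guards are my own additions, not something from the thread.

```ruby
require 'openssl'

# Hypothetical helper (not part of Logstash): replaces every field named in
# event['fta'] with the HMAC-SHA256 hex digest of its value.
def anonymize_listed_fields(event, key)
  # Array() tolerates a missing fta field or a single string instead of an array.
  Array(event['fta']).each do |name|
    value = event[name]
    next if value.nil? # listed name has no matching field; leave the event alone
    digest = OpenSSL::Digest.new('SHA256')
    event[name] = OpenSSL::HMAC.hexdigest(digest, key, value.to_s)
  end
  event
end

# Plain Hash standing in for the event (assumption for illustration).
event = { 'fta' => ['f1', 'f2', 'missing'], 'f1' => 'test', 'f2' => 5551212 }
anonymize_listed_fields(event, '123456789')
event.each { |name, value| puts "#{name} => #{value.inspect}" }
```

One caveat: the event['field'] syntax used throughout this thread matches the Logstash version in use at the time. As far as I know, Logstash 5.0 and later replaced it with event.get('field') and event.set('field', value), so the filter code would need adjusting on newer releases.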