Anonymization of sensitive data with Logstash

Hello,
can anyone help me anonymize sensitive data in a multiline log?
How can I anonymize the value of the parameter sensitive data:

sensitive data:more than one line
some part on second
and on third line

should be replaced by

sensitive data: ANONYMIZED

The structure of the log:

# modify 1597833688 lorem ipsum lorem ipsum lorem ipsum
lorem: ipsum
lorem: ipsum
lorem: ipsum
lorem: ipsum
-
lorem: ipsum
lorem: ipsum
-
sensitive data:more than one line
some part on second
and on third line

lorem: ipsum

Is it possible to do this with Logstash?

I've tried

  mutate {
    gsub => ["message", "sensitive data:\s*\S+", "sensitive data: ANONYMIZED"]
  }

The replacement was applied only on the line where sensitive data: occurred.
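
The result of that attempt can be reproduced outside Logstash. The following is a minimal sketch in plain Ruby, assuming that the gsub option of mutate treats the pattern like a Ruby regular expression; the variable names message and result are made up for illustration. Because \S+ stops at the first whitespace character, only the first word after the colon is replaced and the following lines stay untouched.

```ruby
# Sample event text taken from the log excerpt in this post
# (the variable name is hypothetical).
message = "sensitive data:more than one line\nsome part on second\nand on third line\n"

# Same pattern as in the mutate filter: \S+ matches only up to the
# first whitespace character, so it never reaches the next lines.
result = message.gsub(/sensitive data:\s*\S+/, "sensitive data: ANONYMIZED")

puts result
```

The continuation lines are still present in result, which is why a pattern that can span line breaks is needed.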

I think you would have to use a ruby filter and sprintf it.

Thank you @Badger. Could you please post a simple example? I have no idea how to do this using sprintf. I found something like this:

filter {
  ruby {
    # Event API (Logstash 5.0 and later): event.get / event.set replace event['field']
    code => "
      event.set('message', event.sprintf('%{message}'))
    "
  }
}

Is there any documentation for the sprintf function?

Sorry, I have no idea why I suggested that. It makes no sense to me today.

Is it always 3 lines?
Is there something to indicate when the sensitive data ends?

Hello @aaron-nimocks,
the end indicator is a dash "-".

-
sensitive data: more than one line
some part on second
and on third line
and other line
-

Are you ingesting that log as a single event? If you are, you could use

mutate { gsub => [ "message", "(?m)sensitive data.*^-", "sensitive data: ANONYMIZED" ] }

If every line is a standalone event, then you could use a ruby filter and maintain state in an @instance variable.
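
For the one-event-per-line case, the idea of keeping state in an instance variable can be sketched in plain Ruby. This is only an illustration under assumptions: the class SensitiveBlockAnonymizer and its process method are hypothetical names, not part of Logstash, and the sketch assumes the block always ends with a line containing only a dash, as described earlier in the thread.

```ruby
# Hypothetical helper that mimics what a ruby filter could do when every
# log line arrives as its own event. State lives in an instance variable.
class SensitiveBlockAnonymizer
  def initialize
    @in_sensitive = false # true while inside a "sensitive data:" block
  end

  # Returns the line to keep, or nil when the line should be dropped.
  def process(line)
    if line.start_with?("sensitive data:")
      @in_sensitive = true
      "sensitive data: ANONYMIZED"
    elsif @in_sensitive && line.strip == "-"
      @in_sensitive = false # the dash marks the end of the block
      line
    elsif @in_sensitive
      nil # continuation line of the sensitive value
    else
      line
    end
  end
end

lines = [
  "lorem: ipsum",
  "-",
  "sensitive data: more than one line",
  "some part on second",
  "and on third line",
  "-",
  "lorem: ipsum",
]

anonymizer = SensitiveBlockAnonymizer.new
output = lines.map { |l| anonymizer.process(l) }.compact
puts output
```

In an actual ruby filter, the flag would presumably be initialised in the init option, the branching would go into code using event.get and event.set, and event.cancel would be called where this sketch returns nil. The approach depends on events arriving in their original order, which typically means running the pipeline with a single worker.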


@badger Nice solution. Thank you very much.
I've used this:

mutate { gsub => [ "message", "(?m)sensitive data:.*?^-", "sensitive data: ANONYMIZED" ] }
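
The difference between the greedy .* and the lazy .*? matters when the event contains more than one dash line after the sensitive block. Below is a minimal sketch in plain Ruby, assuming that gsub in mutate behaves like Ruby's String#gsub; the sample message and the variable names greedy and lazy are made up for illustration.

```ruby
# Hypothetical single-event message with two dash lines after the
# sensitive block.
message = <<~LOG
  lorem: ipsum
  -
  sensitive data: more than one line
  some part on second
  and on third line
  -
  lorem: ipsum
  -
  lorem: ipsum
LOG

# Greedy: .* runs to the LAST line that starts with a dash.
greedy = message.gsub(/(?m)sensitive data.*^-/, "sensitive data: ANONYMIZED")

# Lazy: .*? stops at the FIRST line that starts with a dash.
lazy = message.gsub(/(?m)sensitive data:.*?^-/, "sensitive data: ANONYMIZED")

puts "greedy:", greedy
puts "lazy:", lazy
```

With the greedy pattern, the unrelated lorem: ipsum line between the two dashes is swallowed as well; the lazy pattern removes only the sensitive block. In both cases the closing dash itself is part of the match and is therefore replaced too.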
