What is the recommended approach to handling logs so that they are as GDPR-compliant as possible?
I need to pseudonymize user data. The only resource I have found is this blog post from March 2018: https://www.elastic.co/de/blog/gdpr-personal-data-pseudonymization-part-1
Is this still best practice? Can I use fingerprinting within a Beat instead of Logstash? I'm not running Logstash at the moment and would like to keep my stack as slim as possible.
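From the docs it looks like Beats has a fingerprint processor too. Something along these lines is what I had in mind for my filebeat.yml, though I'm not sure the options are right, and I don't see a key/HMAC setting like the Logstash fingerprint filter has (the ip_fingerprint field name is just a temporary placeholder of my own):

processors:
  # hash the ip field into a temporary field
  - fingerprint:
      fields: ["ip"]
      target_field: "ip_fingerprint"
      method: "sha256"
  # drop the original value and move the hash into its place
  - drop_fields:
      fields: ["ip"]
  - rename:
      fields:
        - from: "ip_fingerprint"
          to: "ip"
      ignore_missing: true
      fail_on_error: false
  # ... and the same three steps again for the username field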
The ruby script in the Logstash config creates a new index, 'identities'. Is there any similar processor that can achieve the same thing? Can I recreate this pipeline with specific config settings in my filebeat.yml or metricbeat.yml? For reference, here is the pipeline from the blog post:
input {
  tcp {
    port => 5000
    codec => json_lines
  }
}

filter {
  ruby {
    code => "event.set('identities',[])"
  }

  # pseudonymise the ip field
  fingerprint {
    method => "SHA256"
    source => ["ip"]
    key => "${FINGERPRINT_KEY}"
  }
  # create sub document under the identities field
  mutate {
    add_field => {
      '[identities][0][key]' => "%{fingerprint}"
      '[identities][0][value]' => "%{ip}"
    }
  }
  # overwrite the ip field with the fingerprint
  mutate { replace => { "ip" => "%{fingerprint}" } }

  # pseudonymise the username field
  fingerprint {
    method => "SHA256"
    source => ["username"]
    key => "${FINGERPRINT_KEY}"
  }
  # create sub document under the identities field
  mutate {
    add_field => {
      '[identities][1][key]' => "%{fingerprint}"
      '[identities][1][value]' => "%{username}"
    }
  }
  # overwrite the username field with the fingerprint
  mutate { replace => { "username" => "%{fingerprint}" } }

  # extract sub documents and yield a new document for each one into the LS pipeline.
  # See https://www.elastic.co/guide/en/logstash/current/plugins-filters-ruby.html#_inline_ruby_code
  ruby {
    code => "event.get('identities').each { |p| e = LogStash::Event.new(p); e.tag('identities'); new_event_block.call(e); }"
  }

  # remove fields on the original doc
  mutate {
    remove_field => ["fingerprint", "identities"]
    add_field => { "source" => "fingerprint_pipeline" }
  }
}

output {
  if "identities" in [tags] {
    # route identities to a new index
    elasticsearch {
      index => "identities"
      # use the key as the id to minimise the number of docs and to allow easy lookup
      document_id => "%{[key]}"
      hosts => ["elasticsearch:9200"]
      # create action to avoid unnecessary deletions of existing identities
      action => "create"
      user => "elastic"
      password => "${ELASTIC_PASSWORD}"
      # don't log messages for identity docs which already exist
      failure_type_logging_whitelist => ["version_conflict_engine_exception"]
    }
  } else {
    # route events to a different index
    elasticsearch {
      index => "events"
      hosts => ["elasticsearch:9200"]
      user => "elastic"
      password => "${ELASTIC_PASSWORD}"
    }
  }
}
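On the output side, the closest thing I found in the Beats docs is the indices setting on the Elasticsearch output, which could route tagged events to a separate identities index. But as far as I can tell that only routes documents that already exist; it doesn't create the extra identity documents the way the ruby filter does, and that is the part I'm stuck on. Roughly what I mean (this is a rough sketch, not something I've tested):

output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  username: "elastic"
  password: "${ELASTIC_PASSWORD}"
  # default index for ordinary events
  # (I assume setup.template.name/pattern would also need adjusting when overriding the index)
  index: "events"
  # route anything tagged 'identities' to its own index
  indices:
    - index: "identities"
      when.contains:
        tags: "identities"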
Really looking forward to your answer and thanks a lot,
Nils