You would use pipeline-to-pipeline communication with the forked path pattern.
In the pipeline that forks the data, I would parse the field and generate the hashes. Something like
input { file { path => "/home/user/foo.txt" sincedb_path => "/dev/null" start_position => beginning } }
filter {
    mutate { gsub => [ "message", "^{", "", "message", "}$", "" ] }
    kv { field_split => "|" value_split => ":" trim_key => " " trim_value => " " remove_field => "message" }
    fingerprint { source => "First name" target => "First name Hash" method => "SHA256" }
    fingerprint { source => "Last name" target => "Last name Hash" method => "SHA256" }
    fingerprint { source => "Company" target => "Company Hash" method => "SHA256" }
}
output { pipeline { send_to => ["pipe1", "pipe2"] } }
SHA256 creates long hashes, like "fd53ef835b15485572a6e82cf470dcb41fd218ae5751ab7531c956a2a6bcd3c7". You could use something shorter, for example generating a 32-bit checksum in a ruby filter, but that increases your risk of collisions.
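As a rough sketch of the shorter-checksum idea, the logic could look like the following. It assumes CRC32 from Ruby's standard Zlib library, which is only one possible 32-bit checksum and is my choice for illustration. The helper name short_hash is hypothetical.

```ruby
require "zlib"

# Hypothetical helper: returns a 32-bit CRC of the value as 8 hex characters.
# CRC32 is an assumption made for this sketch; any 32-bit checksum carries
# the same collision trade-off compared with SHA256.
def short_hash(value)
  format("%08x", Zlib.crc32(value.to_s))
end

puts short_hash("John")
```

Inside a ruby filter, the same expression would be wrapped in something like event.set("First name Hash", ...) with event.get("First name") as the input. With only 32 bits, collisions become likely once you have tens of thousands of distinct values, so this only suits small data sets.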
Then in one pipeline, you replace the fields with the hashes
input { pipeline { address => "pipe1" } }
filter {
    mutate {
        rename => {
            "First name Hash" => "First name"
            "Last name Hash" => "Last name"
            "Company Hash" => "Company"
        }
    }
}
and in the other, you save the hashes and the data items they map to
input { pipeline { address => "pipe2" } }
filter {
    ruby {
        code => '
            event.set(event.get("First name Hash"), event.get("First name"))
            event.set(event.get("Last name Hash"), event.get("Last name"))
            event.set(event.get("Company Hash"), event.get("Company"))
        '
    }
    mutate { remove_field => [ "First name Hash", "First name", "Last name Hash", "Last name",
                               "Company Hash", "Company", "Job", "Gender" ] }
}
For a line like
{First name:John |Last name:Doe | Gender:Male | Job:Manager | Company:ABC}
this will generate two events. One like
"fd53ef835b15485572a6e82cf470dcb41fd218ae5751ab7531c956a2a6bcd3c7" => "Doe",
"a8cfcd74832004951b4408cdb0a5dbcd8c7e52d43f7fe244bf720582e05241da" => "John",
"b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78" => "ABC",
and the other like
"Gender" => "Male",
"Job" => "Manager",
"Last name" => "fd53ef835b15485572a6e82cf470dcb41fd218ae5751ab7531c956a2a6bcd3c7",
"Company" => "b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78",
"First name" => "a8cfcd74832004951b4408cdb0a5dbcd8c7e52d43f7fe244bf720582e05241da",
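To check the mapping outside Logstash, the same steps can be approximated in plain Ruby. This is only a sketch: it assumes the fingerprint filter hashes the raw field value with unkeyed SHA256, and the variable names (sensitive, masked, lookup) are made up for illustration.

```ruby
require "digest"

line = "{First name:John |Last name:Doe | Gender:Male | Job:Manager | Company:ABC}"
sensitive = ["First name", "Last name", "Company"]

# Approximate the mutate/gsub and kv steps: strip the braces,
# split on "|" and ":", and trim whitespace from keys and values.
fields = line.sub(/\A\{/, "").sub(/\}\z/, "").split("|").map { |pair|
  key, value = pair.split(":", 2)
  [key.strip, value.strip]
}.to_h

# Approximate the fingerprint filters (assumed: plain SHA256, no key).
hashes = sensitive.map { |name| [name, Digest::SHA256.hexdigest(fields[name])] }.to_h

# pipe1: sensitive fields replaced by their hashes, other fields untouched.
masked = fields.merge(hashes)

# pipe2: each hash mapped back to the original value.
lookup = sensitive.map { |name| [hashes[name], fields[name]] }.to_h

p masked
p lookup
```

Looking up a hash from the masked event in the lookup event, for example lookup[masked["Company"]], gives back the original value.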