Sanitizing Logs Before Shipping

Hi,

I currently have a situation where an application sometimes logs sensitive data to a file. I want to ship these logs using Filebeat but would prefer to sanitize the sensitive fields before they reach Elasticsearch. I can't modify the application. Is there any way to do this with Filebeat? If it isn't currently possible, will this feature be added any time soon? Thanks a lot.

This is currently not possible in Filebeat. For these use cases we recommend adding Logstash and doing the sanitizing of events in LS before forwarding them to ES.
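
For reference, here is a minimal sketch of what such a pipeline could look like. The port, the redaction pattern, and the hosts are all placeholders; adjust them to your setup and log format:

```
input {
  beats {
    port => 5044                  # Filebeat ships here instead of directly to ES
  }
}

filter {
  mutate {
    # Example redaction: mask anything that looks like a 16-digit number
    # in the raw log line before it is indexed.
    gsub => [ "message", "[0-9]{16}", "*****" ]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```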

Out of curiosity: what exactly do you mean by "sanitizing fields"?

What exactly is a field here? Do you have JSON, or are you thinking of some regular expression to match a sub-string in your log?

By sanitizing, do you want to: a) remove the field, b) replace it with a custom string (e.g. ******* of the same or constant length), or c) compute and replace it with a hash (or randomly generated value), such that events with the same sensitive values can still be correlated?

The output is currently JSON, and I wanted to replace the sensitive field with a custom string (*****). But the other option of replacing it with a hash looks interesting as well.

Thanks a lot for your feedback.

If you already have JSON, you can use the drop_fields processor to remove fields. Otherwise you will have to use an Ingest Node pipeline or Logstash to redact contents (ensure you have TLS configured, so that events are encrypted while being sent).
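
To make that concrete, here is a sketch of the Filebeat side, assuming the JSON log line carries a hypothetical `password` field. The paths, hosts, and field names are placeholders, and depending on your Filebeat version the input section may be `filebeat.prospectors` instead of `filebeat.inputs`:

```
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/*.log

processors:
  - decode_json_fields:
      fields: ["message"]     # parse the JSON log line into event fields
      target: ""
  - drop_fields:
      fields: ["password"]    # remove the sensitive field before shipping

output.logstash:
  hosts: ["logstash.example.com:5044"]
  ssl:
    certificate_authorities: ["/etc/filebeat/ca.pem"]   # TLS so events are encrypted in transit
```

And if you end up wanting the hash variant (so events with the same sensitive value can still be correlated), Logstash's fingerprint filter can do that. Again a sketch, with a placeholder field name and key:

```
filter {
  fingerprint {
    source => "password"          # field to hash
    target => "password"          # overwrite it with the hash
    method => "SHA256"
    key    => "some-secret-key"   # keyed hash, so values cannot be trivially brute-forced
  }
}
```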

Thanks steffens. Is there any way to also delete a file after Filebeat has processed it (in my case, a single transaction is stored per file)?

No. Filebeat's main purpose is to tail log files; it relies on external tools to do log rotation and to delete files.
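
Something external like a cron job can handle the cleanup. A minimal sketch, assuming the transaction files live under /var/log/myapp (a placeholder path) and that a one-day age threshold is safely past the point where Filebeat has finished reading them:

```
# crontab entry: once an hour, delete transaction files older than one day
0 * * * * find /var/log/myapp -name '*.log' -mtime +1 -delete
```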

Great. Thanks for your help, steffens.
