Add Winlogbeat option to truncate Security 'message' field to just first line

I'm currently sending the Active Directory audit logs for our large network to my ELK stack (currently 4 VMs, but I have new hardware to deploy and an opportunity to do things better), which works fairly nicely, but boy, oh boy, those AD logs (and others) like to use a lot of space. I've had to aggressively tune and limit my Logstash and Elasticsearch configuration to cut things down to fit.

I'm principally concerned with security audit logs (they are AD servers after all), and so I'm looking at a filtered subset (particular event IDs) of the Security logs.

One of the fields, as emitted by winlogbeat and presented below, is 'message'. It is a re-presentation of the rest of the fields in the event.

{
	"@timestamp" : ...,
	"activity_id" : ...,
	"beat" : ...,
	"computer_name" : ...,
	"event_data" : {
		"AuthenticationPackageName" : ... ,
		"ElevatedToken" : ... ,
		"ImpersonationLevel" : ... ,
		"IpAddress" : ... ,
		"IpPort" : ... ,
		"KeyLength" : ... ,
		"LmPackageName" : ... ,
		"LogonGuid" : ... ,
		"LogonProcessName" : ... ,
		"LogonType" : ... ,
		"ProcessId" : ... ,
		"ProcessName" : ... ,
		"RestrictedAdminMode" : ... ,
		"SubjectDomainName" : ... ,
		"SubjectLogonId" : ... ,
		"SubjectUserName" : ... ,
		"SubjectUserSid" : ... ,
		"TargetDomainName" : ... ,
		"TargetLinkedLogonId" : ... ,
		"TargetLogonId" : ... ,
		"TargetOutboundDomainName" : ... ,
		"TargetOutboundUserName" : ... ,
		"TargetUserName" : ... ,
		"TargetUserSid" : ... ,
		"TransmittedServices" : ... ,
		"VirtualAccount" : ... ,
		"WorkstationName" : ...
	},
	"event_id" : 4624,
	"keywords" : ["Audit Success"],
	"level" : "Information",
	"log_name" : "Security",
	"message" : "An account was successfully logged on.\n\nSubject:\n\tSecurity ID:...
	... a formatted version of the event_data field ...
	... and then a (sometimes quite sizable) chunk of text explaining when you see
	... this particular type of log.",
	"opcode" : "Info",
	"process_id" : 728,
	"provider_guid" : "{54849625-5478-4994-A5BA-3E3B0328C30D}",
	"record_number" : "58676792",
	"source_name" : "Microsoft-Windows-Security-Auditing",
	"task" : "Logon",
	"thread_id" : 10764,
	"type" : "wineventlog",
	"version" : 2
}

I'd like to just drop the message field, but I'm loath to because the first line is very useful to have -- it tells you what this sort of event is. Actually, the last line can be kinda useful too, as it tells you when you might get this, but it's really the first line I want because it's nice to have in Kibana.

I currently have the following in logstash, which truncates the 'message' field to just the first line. I should mention that at this point we are using nxlog to send the logs to logstash, but I'm looking to replace nxlog with winlogbeat and filebeat, so the field names are somewhat different (nxlog emits 'Message', winlogbeat emits 'message').

mutate
{
    gsub => [ "Message", "^(?m)([^\r]*).*", "\1 [...]" ]
}
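
To show the effect outside Logstash, here is a minimal Python sketch of the same idea. It is not part of my pipeline; the function name and the " [...]" marker default are just for illustration. Unlike the gsub above, it only appends the marker when something was actually cut off.

```python
import re


def truncate_to_first_line(message, marker=" [...]"):
    """Keep only the first line of a multi-line message.

    Hypothetical helper for illustration; the marker is appended only
    when text after the first line break was dropped.
    """
    # Split once on the first line break (handles both \r\n and \n).
    parts = re.split(r"\r?\n", message, maxsplit=1)
    if len(parts) == 1:
        # Single-line message: nothing to truncate.
        return message
    return parts[0] + marker


if __name__ == "__main__":
    sample = "An account was successfully logged on.\r\n\r\nSubject:\r\n\tSecurity ID: ..."
    print(truncate_to_first_line(sample))
```

Applied to a 4624 event, this leaves just the "An account was successfully logged on." summary, which is the part that is handy in Kibana.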

Actually, we do some other normalisation tasks in logstash too (such as stripping the ::ffff: prefix from IP addresses, and removing some fields).

This works nicely, and helps me to keep more data inside ES, and gives me a significant speed boost too.
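
For anyone wondering what the ::ffff: normalisation amounts to, here is a rough Python sketch. It is an assumption about the intent rather than a copy of my Logstash filter: it uses the standard-library ipaddress module to turn an IPv4-mapped IPv6 address into plain IPv4 and leaves everything else (including placeholder values such as "-") untouched. The function name is made up for the example.

```python
import ipaddress


def normalise_ip(value):
    """Return plain IPv4 for IPv4-mapped IPv6 input, else the value unchanged."""
    try:
        addr = ipaddress.ip_address(value)
    except ValueError:
        # Not an IP address at all (for example "-"); leave it alone.
        return value
    if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped is not None:
        return str(addr.ipv4_mapped)
    return value


if __name__ == "__main__":
    for raw in ["::ffff:10.0.0.1", "10.0.0.1", "fe80::1", "-"]:
        print(raw, "->", normalise_ip(raw))
```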

So my question is: would it be reasonable to put in a feature request for the ability to conditionally alter a field, such as the 'message' field for security related logs, perhaps by PCRE regex replacement or a range of lines (which it would have to break apart)?

Thanks for reading,
Cameron

We are adding filtering features to Beats to limit the amount of data being shipped. For example, in v5 you can do filtering like:

filter:
  - drop_fields:
      fields: [event_data.Binary]

or

filter:
  - drop_fields:
      fields: [event_data.Binary, message]

And some conditional filters are coming in beta1.

We have not added any features quite like what you are suggesting because it can be done in Logstash, and now in Elasticsearch with the Ingest Node feature. To keep Beats simple, we don't want to duplicate features like grok or gsub. But the feature you are suggesting is more about limiting the amount of data than about grokking or parsing. So I think you should open a feature request, and then the product manager and team can at least consider it.
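
As a rough illustration of the Ingest Node route, the Python sketch below only builds and prints a pipeline definition that uses the gsub processor to cut the field off at the first line break. The function name, the description text, and the " [...]" marker are assumptions for the example. The printed JSON would be the body of a PUT to _ingest/pipeline/ followed by a pipeline name of your choosing, and events that lack the field may need separate handling.

```python
import json


def first_line_pipeline(field="message", marker=" [...]"):
    """Build an ingest pipeline body that keeps only the first line of a field.

    Hypothetical helper: it only constructs the JSON document and does not
    talk to Elasticsearch.
    """
    return {
        "description": "Keep only the first line of a multi-line field",
        "processors": [
            {
                "gsub": {
                    "field": field,
                    # (?s) lets '.' match line breaks, so everything from
                    # the first line break onwards is replaced by the marker.
                    "pattern": r"(?s)\r?\n.*",
                    "replacement": marker,
                }
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(first_line_pipeline(), indent=2))
```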
