I'm currently sending the Active Directory audit logs for our large network to my ELK stack (currently 4 VMs, but I have new hardware to deploy and an opportunity to do things better), which works fairly nicely, but boy, oh boy, those AD logs (and others) like to use a lot of space. I've had to aggressively tune and limit my Logstash and Elasticsearch configuration to cut things down to fit.
I'm principally concerned with security audit logs (they are AD servers, after all), so I'm looking at a filtered subset (particular event IDs) of the Security log.
One of the fields, as emitted by winlogbeat and presented below, is 'message'. It is a re-presentation of the rest of the fields in the event.
{
  "@timestamp" : ...,
  "activity_id" : ...,
  "beat" : ...,
  "computer_name" : ...,
  "event_data" : {
    "AuthenticationPackageName" : ... ,
    "ElevatedToken" : ... ,
    "ImpersonationLevel" : ... ,
    "IpAddress" : ... ,
    "IpPort" : ... ,
    "KeyLength" : ... ,
    "LmPackageName" : ... ,
    "LogonGuid" : ... ,
    "LogonProcessName" : ... ,
    "LogonType" : ... ,
    "ProcessId" : ... ,
    "ProcessName" : ... ,
    "RestrictedAdminMode" : ... ,
    "SubjectDomainName" : ... ,
    "SubjectLogonId" : ... ,
    "SubjectUserName" : ... ,
    "SubjectUserSid" : ... ,
    "TargetDomainName" : ... ,
    "TargetLinkedLogonId" : ... ,
    "TargetLogonId" : ... ,
    "TargetOutboundDomainName" : ... ,
    "TargetOutboundUserName" : ... ,
    "TargetUserName" : ... ,
    "TargetUserSid" : ... ,
    "TransmittedServices" : ... ,
    "VirtualAccount" : ... ,
    "WorkstationName" : ...
  },
  "event_id" : 4624,
  "keywords" : ["Audit Success"],
  "level" : "Information",
  "log_name" : "Security",
  "message" : "An account was successfully logged on.\n\nSubject:\n\tSecurity ID:...
    ... a formatted version of the event_data field ...
    ... and then a (sometimes quite sizable) chunk of text explaining when you see
    ... this particular type of log.",
  "opcode" : "Info",
  "process_id" : 728,
  "provider_guid" : "{54849625-5478-4994-A5BA-3E3B0328C30D}",
  "record_number" : "58676792",
  "source_name" : "Microsoft-Windows-Security-Auditing",
  "task" : "Logon",
  "thread_id" : 10764,
  "type" : "wineventlog",
  "version" : 2
}
I'd like to just drop the message field, but I'm loath to because the first line is very useful to have -- it tells you what this sort of event is. Actually, the last line can be kinda useful too, as it tells you when you might get this, but it's really the first line I want because it's nice to have in Kibana.
I currently have the following in Logstash, which truncates the 'message' field to just its first line. I should mention that at this point we are using nxlog to send the logs to Logstash, but I'm looking to replace nxlog with winlogbeat and filebeat, so the field names are somewhat different.
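mutate
{
  # Keep only the first line of Message (everything before the first \r),
  # drop the rest, and append " [...]" to show that it was truncated.
  gsub => [ "Message", "^(?m)([^\r]*).*", "\1 [...]" ]
}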
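For what it's worth, I imagine the winlogbeat equivalent would look something like the sketch below. It is untested and rests on a few assumptions: that the field is the lowercase 'message' shown in the sample event, that 'log_name' is "Security" for the events I care about, and that lines may be separated by a bare \n (as the sample suggests) rather than \r\n, which is why the character class excludes both.

```
filter {
  # Assumption: events carry "log_name" and a lowercase "message" field,
  # as in the winlogbeat sample event.
  if [log_name] == "Security" {
    mutate {
      # Keep the first line only; [^\r\n] stops at either kind of line break.
      gsub => [ "message", "^(?m)([^\r\n]*).*", "\1 [...]" ]
    }
  }
}
```

Wrapping the mutate in a conditional means other log types keep their full message.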
Actually, we do some other normalisation tasks in Logstash too (such as stripping off the ::ffff: prefix from IP addresses, and removing some fields).
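As a rough illustration of that kind of normalisation (not my actual config), here is a sketch using the winlogbeat field layout from the sample event. The field reference [event_data][IpAddress] assumes that layout, and the two removed fields are arbitrary picks from the sample, chosen only to show the syntax.

```
filter {
  mutate {
    # Strip a leading IPv4-mapped IPv6 prefix,
    # e.g. "::ffff:10.0.0.1" becomes "10.0.0.1".
    gsub => [ "[event_data][IpAddress]", "^::ffff:", "" ]
    # Illustrative choices only; drop whichever fields you don't need.
    remove_field => [ "provider_guid", "opcode" ]
  }
}
```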
This works nicely, and helps me to keep more data inside ES, and gives me a significant speed boost too.
So my question is: would it be reasonable to put in a feature request for the ability to conditionally alter a field, such as the 'message' field for security-related logs, perhaps by PCRE regex replacement or by selecting a range of lines (which would require splitting the field into lines first)?
Thanks for reading,
Cameron