Logstash config for nested log

Hello everyone,
I have multiple event logs from Windows, which I converted to CSV using EvtxECmd.exe.
I combined many event logs into one CSV file and need to upload it to ELK. The column names are below.

RecordNumber,EventRecordId,TimeCreated,EventId,Level,Provider,Channel,ProcessId,ThreadId,Computer,ChunkNumber,UserId,MapDescription,UserName,RemoteHost,PayloadData1,PayloadData2,PayloadData3,PayloadData4,PayloadData5,PayloadData6,ExecutableInfo,HiddenRecord,SourceFile,Keywords,ExtraDataOffset,Payload

Below is a sample log line (I picked what is probably the longest one).

3308,530390,2021-07-01 09:20:40.6757225,4016,Info,Microsoft-Windows-GroupPolicy,Microsoft-Windows-GroupPolicy/Operational,10840,1104,DESKTOP-2C7Q9UJ.mydomain.com,42,S-1-5-18,,,,,,,,,,,False,z:\desktop-Event_Logs\DESKTOP-2C7Q9UJ-Microsoft-Windows-GroupPolicy-Operational.evtx,0x4000000000000000,0,"{""EventData"":{""Data"":[{""@Name"":""CSEExtensionId"",""#text"":""827d319e-6eac-11d2-a4ea-00c04f79f83a""},{""@Name"":""CSEExtensionName"",""#text"":""Security""},{""@Name"":""IsExtensionAsyncProcessing"",""#text"":""True""},{""@Name"":""IsGPOListChanged"",""#text"":""False""},{""@Name"":""GPOListStatusString"",""#text"":""%%4101""},{""@Name"":""DescriptionString"",""#text"":""Default Domain Policy, Deny Logon locally for service account, ""},{""@Name"":""ApplicableGPOList"",""#text"":""<GPO ID=""{31B2F340-016D-11D2-945F-00C04FB984F9}""><Name>Default Domain Policy</Name></GPO><GPO ID=""{534B3EFE-DA09-40FA-B138-C526C40F272D}""><Name>Deny Logon locally for service account</Name></GPO>""}]}}"

The last column, "Payload", contains deeply nested data: JSON, then a list, then JSON again (it starts with "{""EventData""). The JSON fields with @Name and #text may vary depending on the log.

Can someone guide me on how to create a config file for this? Please explain the mappings clearly, because I may need to go another route using a different format.

As another method, I worked a bit with sed in an Ubuntu terminal and converted the log line to the form below. I removed the nested JSON mappings and renamed @Name to name1, name2, name3, and so on, and #text to text1, text2, text3, and so on.

3308,530390,2021-07-01 09:20:40.6757225,4016,Info,Microsoft-Windows-GroupPolicy,Microsoft-Windows-GroupPolicy/Operational,10840,1104,DESKTOP-22222.myDomain.com,42,S-1-5-18,,,,,,,,,,,False,z:\desktop-Event_Logs\DESKTOP-22222-Microsoft-Windows-GroupPolicy-Operational.evtx,0x4000000000000000,0,"Data":{"name1":"CSEExtensionId","text1":"827d319e-6eac-11d2-a4ea-00c04f79f83a","name2":"CSEExtensionName","text2":"Security","name3":"IsExtensionAsyncProcessing","text3":"True","name4":"IsGPOListChanged","text4":"False","name5":"GPOListStatusString","text5":"%%4101","name6":"DescriptionString","text6":"Default Domain Policy, Deny Logon locally for service account, ","Name7":"ApplicableGPOList","text7":"<GPO ID="{33322340-016D-11D2-945F-00C04FB984F9}"><Name>Default Domain Policy</Name></GPO><GPO ID="{534B3EFE-DA09-40FA-B138-C526C40F272D}"><Name>Deny Logon locally for service account</Name></GPO>"}

Here I converted all the nested JSON into a single JSON object with the key "Data" (please correct me if my understanding of how to build JSON is wrong here, but I have tried all kinds of combinations). When I upload this file to ELK, I get a csvparsefailure.
Please point out where I am going wrong.

Your Payload field contains commas, so it needs to be quoted, and that means all the double quotes within it need to be escaped using a second double quote...

"""Data"": {""name1""

etc. I think you would be better off using a csv filter to parse the unmodified output of Event Explorer and then modify the data structure in logstash.
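To make the quoting rule concrete, here is a minimal Ruby sketch of it. The helper names `csv_quote` and `csv_unquote` are made up for illustration and are not part of Logstash; the sample field is a shortened version of the hand-edited Payload above.

```ruby
# Hypothetical helper: wrap a field in double quotes and double every
# double quote inside it, which is what a CSV parser expects.
def csv_quote(field)
  '"' + field.gsub('"', '""') + '"'
end

# Hypothetical inverse: strip the outer quotes and collapse doubled quotes.
def csv_unquote(quoted)
  quoted[1..-2].gsub('""', '"')
end

payload = '"Data":{"name1":"CSEExtensionId","text1":"827d319e-6eac-11d2-a4ea-00c04f79f83a"}'
quoted = csv_quote(payload)

puts quoted               # the form the CSV file needs
puts csv_unquote(quoted)  # what the csv filter hands back as the field value
```

The quoted form begins with three double quotes: one that opens the CSV field and two that stand for the literal quote before Data.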

I modified the log line and the csvparsefailure is gone. The log line is now:

3308,530390,2021-07-01 09:20:40.6757225,4016,Info,Microsoft-Windows-GroupPolicy,Microsoft-Windows-GroupPolicy/Operational,10840,1104,DESKTOP-22222.mydomain.com,42,S-1-5-18,,,,,,,,,,,False,z:\desktop-Event_Logs\DESKTOP-22222-Microsoft-Windows-GroupPolicy-Operational.evtx,0x4000000000000000,0,"""Data"":{""@Name"":""CSEExtensionId"",""#text"":""827d319e-6eac-11d2-a4ea-00c04f79f83a"",""@Name"":""CSEExtensionName"",""#text"":""Security"",""@Name"":""IsExtensionAsyncProcessing"",""#text"":""True"",""@Name"":""IsGPOListChanged"",""#text"":""False"",""@Name"":""GPOListStatusString"",""#text"":""%%4101"",""@Name"":""DescriptionString"",""#text"":""Default Domain Policy, Deny Logon locally for service account, "",""@Name"":""ApplicableGPOList"",""#text"":""<GPO ID=""{31B2F340-016D-11D2-945F-00C04FB984F9}""><Name>Default Domain Policy</Name></GPO><GPO ID=""{534B3EFE-DA09-40FA-B138-C526C40F272D}""><Name>Deny Logon locally for service account</Name></GPO>""}"

But do you think the log should look like this in Kibana?

"""Data"":{""@Name"":""CSEExtensionId"",""#text"":""827d ...--trimmed--...

The config file is:

    input {
        file {
            path => "/home/kriss/triage/security-duplicate.csv"
            start_position => "beginning"
            sincedb_path => "/dev/null"
        }
    }
    filter {
        csv {
            separator => ","
            columns => ["RecordNumber","EventRecordId","TimeCreated","EventId","Level","Provider","Channel","ProcessId","ThreadId","Computer","ChunkNumber","UserId","MapDescription","UserName","RemoteHost","PayloadData1","PayloadData2","PayloadData3","PayloadData4","PayloadData5","PayloadData6","ExecutableInfo","HiddenRecord","SourceFile","Keywords","ExtraDataOffset","Payload"]
        }
        # json {
        #     source => "Payload"
        # }
    }
    output {
        elasticsearch { hosts => "localhost" index => "eventlogs" }
        stdout { }
    }

It would be great if you could provide a somewhat detailed explanation. I am new to ELK. Thanks.

You are modifying the JSON to be even less valid than it starts off being. Looking at the original output from Event Explorer, it contains

"#text":"<GPO ID="{31B2F340-016D-11D2-945F-00C04FB984F9}"><Name>Default...

The double quote following ID= needs to be escaped. I would fix that by removing it.

    mutate {
        gsub => [
            "Payload", '"{', "{",
            "Payload", '}"', "}"
        ]
    }

However, it is entirely possible that that will break things for other messages. Since the data you start with is not valid JSON it is going to be hard to get around that for the general case.
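As a quick way to check the effect of those two substitutions outside Logstash, here is a small Ruby sketch. It assumes a shortened Payload as it would look after the csv filter has collapsed the doubled quotes; the helper `valid_json?` is made up for illustration.

```ruby
require "json"

# Shortened Payload, after CSV decoding. Note the unescaped quotes around the GPO ID.
payload = '{"EventData":{"Data":[{"@Name":"CSEExtensionName","#text":"Security"},' \
          '{"@Name":"ApplicableGPOList","#text":"<GPO ID="{31B2F340-016D-11D2-945F-00C04FB984F9}">' \
          '<Name>Default Domain Policy</Name></GPO>"}]}}'

# Hypothetical helper: true if the text parses as JSON, false otherwise.
def valid_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

# The same two literal substitutions as the mutate/gsub above.
fixed = payload.gsub('"{', "{").gsub('}"', "}")

puts valid_json?(payload)  # the raw Payload does not parse
puts valid_json?(fixed)    # the patched Payload does
puts JSON.parse(fixed)["EventData"]["Data"][1]["#text"]
```

In this sample the only places where a quote sits directly before { or directly after } are the GPO IDs, which is why the substitution is safe here. Other event types may not be so lucky.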

Once that is done

    json { source => "Payload" target => "[@metadata][payload]" }
    ruby {
        code => '
            data = event.get("[@metadata][payload][EventData][Data]")
            if data.is_a? Array
                h = {}
                data.each { |x|
                    h[x["@Name"]] = x["#text"]
                }
                event.set("someField", h)
            end
        '
    }

will get you

      "someField" => {
              "IsGPOListChanged" => "False",
              "CSEExtensionName" => "Security",
             "ApplicableGPOList" => "<GPO ID={31B2F340-016D-11D2-945F-00C04FB984F9}><Name>Default Domain Policy</Name></GPO><GPO ID={534B3EFE-DA09-40FA-B138-C526C40F272D}><Name>Deny Logon locally for service account</Name></GPO>",
             "DescriptionString" => "Default Domain Policy, Deny Logon locally for service account, ",
                "CSEExtensionId" => "827d319e-6eac-11d2-a4ea-00c04f79f83a",
           "GPOListStatusString" => "%%4101",
    "IsExtensionAsyncProcessing" => "True"
},

in addition to all the stuff like

    "ChunkNumber" => "42",
   "PayloadData5" => nil,
 "ExecutableInfo" => nil,
    "TimeCreated" => "2021-07-01 09:20:40.6757225",
        "Channel" => "Microsoft-Windows-GroupPolicy/Operational"

Thanks a lot for helping out, @Badger. I researched the event log output further and saw that a lot of useful information was missing, so I converted the evtx file using PowerShell instead. A single log entry looks like the one below, but the newlines between the JSON fields are messing up parsing. Can you assist here?

{
"Id": 4720,
"Version": 0,
"Qualifiers": null,
"Level": 0,
"Task": 13824,
"Opcode": 0,
"Keywords": -9214364837600034816,
"RecordId": 18962,
"ProviderName": "Microsoft-Windows-Security-Auditing",
"ProviderId": "54849625-5478-4994-a5ba-3e3b0328c30d",
"LogName": "Security",
"ProcessId": 784,
"ThreadId": 816,
"MachineName": "DESKTOP-E46TUNE",
"UserId": null,
"TimeCreated": "/Date(1625113095126)/",
"ActivityId": "91591009-6d89-0003-2310-5991896dd701",
"RelatedActivityId": null,
"ContainerLog": "z:\user-creation-1log.evtx",
"MatchedQueryIds": [

],
"Bookmark": {

},
"LevelDisplayName": "Information",
"OpcodeDisplayName": "Info",
"TaskDisplayName": "User Account Management",
"KeywordsDisplayNames": [
"Audit Success"
],
"Properties": [
{
"Value": "naveen1"
},
{
"Value": "DESKTOP-E46TUNE"
},
{
"Value": "S-1-5-21-2249204231-554948959-1540642952-1002"
},
{
"Value": "S-1-5-21-2249204231-554948959-1540642952-1001"
},
{
"Value": "kriss"
},
{
"Value": "DESKTOP-E46TUNE"
},
{
"Value": 1064542
},
{
"Value": "-"
},
{
"Value": "naveen1"
},
{
"Value": "%%1793"
},
{
"Value": "-"
},
{
"Value": "%%1793"
},
{
"Value": "%%1793"
},
{
"Value": "%%1793"
},
{
"Value": "%%1793"
},
{
"Value": "%%1793"
},
{
"Value": "%%1794"
},
{
"Value": "%%1794"
},
{
"Value": "513"
},
{
"Value": "-"
},
{
"Value": "0x0"
},
{
"Value": "0x15"
},
{
"Value": "\r\n\t\t%%2080\r\n\t\t%%2082\r\n\t\t%%2084"
},
{
"Value": "%%1793"
},
{
"Value": "-"
},
{
"Value": "%%1797"
}
],
"Message": "A user account was created.\r\n\r\nSubject:\r\n\tSecurity ID:\t\tS-1-5-21-2249204231-554948959-1540642952-1001\r\n\tAccount Name:\t\tkriss\r\n\tAccount Domain:\t\tDESKTOP-E46TUNE\r\n\tLogon ID:\t\t0x103E5E\r\n\r\nNew Account:\r\n\tSecurity ID:\t\tS-1-5-21-2249204231-554948959-1540642952-1002\r\n\tAccount Name:\t\tnaveen1\r\n\tAccount Domain:\t\tDESKTOP-E46TUNE\r\n\r\nAttributes:\r\n\tSAM Account Name:\tnaveen1\r\n\tDisplay Name:\t\t\u003cvalue not set\u003e\r\n\tUser Principal Name:\t-\r\n\tHome Directory:\t\t\u003cvalue not set\u003e\r\n\tHome Drive:\t\t\u003cvalue not set\u003e\r\n\tScript Path:\t\t\u003cvalue not set\u003e\r\n\tProfile Path:\t\t\u003cvalue not set\u003e\r\n\tUser Workstations:\t\u003cvalue not set\u003e\r\n\tPassword Last Set:\t\u003cnever\u003e\r\n\tAccount Expires:\t\t\u003cnever\u003e\r\n\tPrimary Group ID:\t513\r\n\tAllowed To Delegate To:\t-\r\n\tOld UAC Value:\t\t0x0\r\n\tNew UAC Value:\t\t0x15\r\n\tUser Account Control:\t\r\n\t\tAccount Disabled\r\n\t\t\u0027Password Not Required\u0027 - Enabled\r\n\t\t\u0027Normal Account\u0027 - Enabled\r\n\tUser Parameters:\t\u003cvalue not set\u003e\r\n\tSID History:\t\t-\r\n\tLogon Hours:\t\tAll\r\n\r\nAdditional Information:\r\n\tPrivileges\t\t-"
}

Can you provide more detail on how you are converting it?

If you need to consume multiple lines as a single event you would use a multiline codec, but you may be able to get PS to produce a single line.

This is the command I used to convert the .evtx log file to .json using PowerShell:

Get-WinEvent -Path .\log.evtx | ConvertTo-Json | Out-File log.json -Encoding utf8

Look at the Compress and Depth options to ConvertTo-Json. Try

ConvertTo-Json -Compress -Depth 100

The -Compress option really did the trick. But I am not sure whether -Depth is useful. The difference between the output of -Depth 1 and -Depth 10 is confusing me, and I am not sure which one is best from a Logstash parsing perspective. It looks like the larger the depth, the more JSON nesting is preserved. To me, -Depth 2 seemed perfect.
Is there any good reading on nested JSON and dictionary parsing? I can't find it in the official documentation.

Does this help?...

> '{ "1": { "2": {"3": {"4": { "5" : 1 }}}}}' | ConvertFrom-Json | ConvertTo-Json -Compress
{"1":{"2":{"3":"@{4=}"}}}

> '{ "1": { "2": {"3": {"4": { "5" : 1 }}}}}' | ConvertFrom-Json | ConvertTo-Json -Compress -Depth 3
{"1":{"2":{"3":{"4":"@{5=1}"}}}}

> '{ "1": { "2": {"3": {"4": { "5" : 1 }}}}}' | ConvertFrom-Json | ConvertTo-Json -Compress -Depth 4
{"1":{"2":{"3":{"4":{"5":1}}}}}

You need a value of -Depth as large as or larger than the depth of the deepest nesting in your JSON. (The first example omits -Depth, so it uses the default of 2.)

PowerShell objects can reference themselves, which could result in an infinite depth during the conversion to JSON. That's why the depth is limited. This SO Q&A provides more detail.
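If you want to estimate the -Depth you need from a sample document, a rough Ruby sketch follows. It assumes that every nested object or array costs one level, which agrees with the three examples above but is not taken from the PowerShell documentation; the function names are made up for illustration.

```ruby
require "json"

# Count the levels of containers (objects or arrays), including the top-level one.
def container_levels(value)
  children = case value
             when Hash then value.values
             when Array then value
             else return 0 # scalars add no level
             end
  1 + (children.map { |child| container_levels(child) }.max || 0)
end

# Assumption: the top-level container is free, every level below it needs one unit of -Depth.
def required_depth(value)
  [container_levels(value) - 1, 0].max
end

doc = JSON.parse('{ "1": { "2": {"3": {"4": { "5" : 1 }}}}}')
puts required_depth(doc)
```

For the document used in the examples this yields 4, the first -Depth value at which nothing was turned into an "@{...}" string.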

Thanks, @Badger. This made things much clearer. I then studied JSON formats properly and found another kind of workaround (for the benefit of anyone who wants to parse Windows event logs as JSON in Logstash):

  1. Convert the event log .evtx file to JSON using Get-WinEvent. The command below converts all .evtx files into JSON files with the same name (still including .evtx in the name; they can be renamed later). I found -Depth 2 to be the most suitable depth; anything higher just nested unwanted fields in the JSON.

foreach ($file in Get-ChildItem -Path .) {Get-WinEvent -Path $file | ConvertTo-Json -Depth 2 -Compress | Out-File "$($file).json" -Encoding utf8 }

  2. The resulting JSON file is a single line containing an array, something like:

[{"Id":4726,"Version":0,........,"something":"some"},{"Id":4733,.........,"something":"some"}]

Remove the opening square bracket [ and the closing square bracket ] using sed.
  3. The Message section of the log contains newline characters \r\n. Replace them with , using sed so that parts of one log entry do not get split into multiple entries in Elasticsearch. I need to revisit this replacement and do something more useful.
  4. Now replace },{ with }\r\n{ using sed.
  5. Sometimes there are repeated JSON keys named Value. Replace them one by one with value1, value2, value3, and so on. You can do this by running sed in a loop without global substitution.
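For comparison, steps 2 to 5 could also be done with a JSON parser instead of sed. The following is only a sketch under assumptions: the function names and the shortened sample are made up, and it assumes each entry in "Properties" is an object with a single "Value" key, as in the sample event earlier in the thread.

```ruby
require "json"

# Replace the "Properties" list of {"Value": ...} objects with value1, value2, ...
def flatten_properties(event)
  props = event["Properties"]
  return event unless props.is_a?(Array)

  flat = event.reject { |key, _| key == "Properties" }
  props.each_with_index do |prop, index|
    flat["value#{index + 1}"] = prop.is_a?(Hash) ? prop["Value"] : prop
  end
  flat
end

# Turn the single-line JSON array into one JSON document per line.
# Because the text is parsed rather than edited, the escaped \r\n inside
# "Message" stays inside its string and cannot split a record.
def array_to_ndjson(array_text)
  events = JSON.parse(array_text)
  events = [events] unless events.is_a?(Array) # tolerate a file holding one object
  events.map { |event| JSON.generate(flatten_properties(event)) }.join("\n")
end

sample = '[{"Id":4726,"Message":"line one\r\nline two","Properties":[{"Value":"naveen1"},{"Value":513}]},' \
         '{"Id":4733,"Properties":[]}]'
puts array_to_ndjson(sample)
```

Each output line is a complete JSON object, so a json filter with source => "message" can parse it, and the Message text keeps its original line breaks in escaped form.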

That's all: very basic and clumsy, but parsable JSON logs are ready.
I am left with one challenge. In the Message field I replaced each \r\n with a comma. The original value of the Message key was:

"Message":"A member was added to a security-enabled global group.\r\n\r\nSubject:\r\n\tSecurity ID:\t\tS-1-5-21-2249204231-554948959-1540642952-1001\r\n\tAccount Name:\t\tkriss\r\n\tAccount Domain:\t\tDESKTOP-E46TUNE............trimmed....

After replacing \r\n with a comma, it appears like this in Kibana:

Message: A member was added to a security-enabled global group.,,Subject:, Security ID: S-1-5-21-2249204231-554948959-1540642952-1001......trimmed........

The part of the event log causing the problem is the one that holds the details shown in the GUI. I can't believe how Microsoft ruined the logging system for the sake of a nice GUI.
The Message key's value has become a kind of CSV now. Any ideas on how to make it more parsable?
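One possible direction, offered as a rough sketch rather than a tested recipe: leave the original \r\n and \t characters in Message and split on them, since they already mark sections and key/value pairs. The function name `parse_message` and the output layout are made up for illustration, the sample is a shortened version of the 4720 message above, and other event types may be laid out differently.

```ruby
# Split a Message into a summary line plus sections of key/value pairs.
# Assumed layout: untabbed lines ending in ":" open a section, lines with one
# leading tab are "Key:<tabs>value", lines with two leading tabs continue the previous key.
def parse_message(message)
  parsed = { "summary" => nil, "sections" => {} }
  section = nil
  last_key = nil

  message.split(/\r?\n/).each do |line|
    text = line.strip
    next if text.empty?

    if !line.start_with?("\t")
      if text.end_with?(":")
        section = text.chomp(":")
        parsed["sections"][section] = {}
        last_key = nil
      else
        parsed["summary"] ||= text
      end
    elsif section.nil?
      next
    elsif line.start_with?("\t\t") && last_key
      # Continuation line: collect it as a list under the preceding key.
      current = parsed["sections"][section][last_key]
      parsed["sections"][section][last_key] = Array(current) + [text]
    else
      key, value = text.split(/\t+/, 2)
      last_key = key.chomp(":")
      parsed["sections"][section][last_key] = value
    end
  end
  parsed
end

message = "A user account was created.\r\n\r\nSubject:\r\n\tSecurity ID:\t\tS-1-5-21-2249204231-554948959-1540642952-1001\r\n" \
          "\tAccount Name:\t\tkriss\r\n\r\nAttributes:\r\n\tOld UAC Value:\t\t0x0\r\n\tUser Account Control:\t\r\n" \
          "\t\tAccount Disabled\r\n\t\t'Normal Account' - Enabled\r\n\tLogon Hours:\t\tAll\r\n\r\n" \
          "Additional Information:\r\n\tPrivileges\t\t-"

parsed = parse_message(message)
puts parsed["summary"]
puts parsed["sections"]["Subject"]["Account Name"]
```

The same logic could probably be placed in a ruby filter, reading Message with event.get and writing the result with event.set, but that only works if the \r\n sequences have not already been replaced with commas.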
