Logstash Fixing Incoming Json Log to Parse


(Hüseyin Fatih Akar) #1

Hello there.

I am sending some logs from my SIEM to Logstash. My SIEM prepends a string to each log, which causes Logstash to fail when parsing it. For days I have tried to stop my SIEM from adding this header string, but I could not find a way.

This is the log:
{
"_index": "yhkrtl-snort:logstash-syslog-2018.10.11",
"_type": "doc",
"_id": "4AaTY2YBcyVXAtr474Mm",
"_version": 1,
"_score": null,
"_source": {
"port": 49640,
"logstash_time": 50.578811168670654,
"@version": "1",
"message": "<01>- hostname {"name":"Sysmon","version":"1.0","isoTimeFormat":"yyyy-MM-dd'T'HH:mm:ss.SSSZ","type":"Event","ParentImageName":"ccSvcHst.exe","ImageName":"cscript.exe","Process CommandLine":"C:\\Windows\\system32\\cscript.exe //Job:AgentHIScript \"C:\\Program Files (x86)\\Symantec\\Symantec Endpoint Protection\\14.0.3929.1200.105\\Bin64\\AVScript8.js\" \"15562\" \"Helper.exe\" \"Symantec.SSHelper\" \"C:\" \"22\" \"C:\\PROGRA~2\\Symantec\\SYMANT~1\\140392~1.105\\Temp\\\" \"0\" //E:JScript","SHA256 Hash":"E886D8860CC05F341D764B27277E10091D0B5E313E086C9B93C75E5FB6F3F4B5","File Hash":"17E650E888D57AB51E9C3494E49A2045","ParentImage":"C:\\Program Files (x86)\\Symantec\\Symantec Endpoint Protection\\14.0.3929.1200.105\\Bin\\ccSvcHst.exe","ParentCommandLine":"\"C:\\Program Files (x86)\\Symantec\\Symantec Endpoint Protection\\14.0.3929.1200.105\\Bin\\ccSvcHst.exe\" /s \"Symantec Endpoint Protection\" /m \"C:\\Program Files (x86)\\Symantec\\Symantec Endpoint Protection\\14.0.3929.1200.105\\Bin\\sms.dll\" /prefetch:1","Image":"C:\\Windows\\System32\\cscript.exe","Parent Process Guid":"B3870CB9-773E-5BBC-0000-0010E8190300","Process Guid":"B3870CB9-600E-5BBF-0000-0010338A4A16","Process Id":"5852"}",
"@timestamp": "2018-10-11T14:38:50.264Z",
"host": "yhkrtl-qradar.yildiz.domain",
"tags": [
"_jsonparsefailure",
"syslogng",
"syslog"
]
},
"fields": {
"@timestamp": [
"2018-10-11T14:38:50.264Z"
]
},
"highlight": {
"message": [
"<01>- hostname {"name":"@kibana-highlighted-field@Sysmon@/kibana-highlighted-field@","version":"1.0","isoTimeFormat":"yyyy-MM-dd'T'HH:mm:ss.SSSZ","type":"Event","ParentImageName":"ccSvcHst.exe","ImageName":"cscript.exe","Process CommandLine":"C:\\Windows\\system32\\cscript.exe //Job:AgentHIScript \"C:\\Program Files (x86)\\Symantec\\Symantec Endpoint Protection\\14.0.3929.1200.105\\Bin64\\AVScript8.js\" \"15562\" \"Helper.exe\" \"Symantec.SSHelper\" \"C:\" \"22\" \"C:\\PROGRA~2\\Symantec\\SYMANT~1\\140392~1.105\\Temp\\\" \"0\" //E:JScript","SHA256 Hash":"E886D8860CC05F341D764B27277E10091D0B5E313E086C9B93C75E5FB6F3F4B5","File Hash":"17E650E888D57AB51E9C3494E49A2045","ParentImage":"C:\\Program Files (x86)\\Symantec\\Symantec Endpoint Protection\\14.0.3929.1200.105\\Bin\\ccSvcHst.exe","ParentCommandLine":"\"C:\\Program Files (x86)\\Symantec\\Symantec Endpoint Protection\\14.0.3929.1200.105\\Bin\\ccSvcHst.exe\" /s \"Symantec Endpoint Protection\" /m \"C:\\Program Files (x86)\\Symantec\\Symantec Endpoint Protection\\14.0.3929.1200.105\\Bin\\sms.dll\" /prefetch:1","Image":"C:\\Windows\\System32\\cscript.exe","Parent Process Guid":"B3870CB9-773E-5BBC-0000-0010E8190300","Process Guid":"B3870CB9-600E-5BBF-0000-0010338A4A16","Process Id":"5852"}"
]
},
"sort": [
1539268730264
]
}

As you can see, the SIEM adds "<01>- hostname " to the start of the message field. Can I strip that so that Logstash can parse and index the event? Or should I do something else?

This is the Logstash error log:

at [Source: (String)"<13>Oct 11 17:46:22 10.55.1.11 AgentDevice=WindowsLog     AgentLogFile=Microsoft-Windows-Sysmon/Operational       PluginVersion=7.2.8.91  Source=Microsoft-Windows-Sysmon Computer=YHKRTLDC01.YILDIZ.DOMAIN       OriginatingComputer=10.55.1.11       User=SYSTEM     Domain=NT AUTHORITY     EventID=1       EventIDCode=1   EventType=4     EventCategory=1 RecordNumber=2895152    TimeGenerated=1539269160        TimeWritten=1539269160  Level=Informational Keywords=0x8000000000000000      Task=SysmonTask-SYSMON_CREATE_PROCESS   Opcode=Info     Message=Process Create: R"[truncated 1244 chars]; line: 1, column: 2]>, :data=>"<13>Oct 11 17:46:22 10.55.1.11 AgentDevice=WindowsLog\tAgentLogFile=Microsoft-Windows-Sysmon/Operational\tPluginVersion=7.2.8.91\tSource=Microsoft-Windows-Sysmon\tComputer=YHKRTLDC01.YILDIZ.DOMAIN\tOriginatingComputer=10.55.1.11\tUser=SYSTEM\tDomain=NT AUTHORITY\tEventID=1\tEventIDCode=1\tEventType=4\tEventCategory=1\tRecordNumber=2895152\tTimeGenerated=1539269160\tTimeWritten=1539269160\tLevel=Informational\tKeywords=0x8000000000000000\tTask=SysmonTask-SYSMON_CREATE_PROCESS\tOpcode=Info\tMessage=Process Create: RuleName:  UtcTime: 2018-10-11 14:46:00.015 ProcessGuid: {199A51F7-6228-5BBF-0000-00100425B9F8} ProcessId: 10868 Image: C:\\Windows\\System32\\cscript.exe FileVersion: 5.8.9600.16384 Description: Microsoft ® Console Based Script Host Product: Microsoft ® Windows Script Host Company: Microsoft Corporation CommandLine: \"C:\\Windows\\system32\\cscript.exe\" //nologo \"C:\\Program Files\\Microsoft Monitoring Agent\\Agent\\Health Service State\\Monitoring Host Temporary Files 264324\\9955\\AD_Replication_Partner_Op_Master_Consistency.vbs\" YHKRTLDC01.YILDIZ.DOMAIN false 4 {9A6C41D7-1F52-526D-D104-A290917F7514} CurrentDirectory: C:\\Program Files\\Microsoft Monitoring Agent\\Agent\\Health Service State\\Monitoring Host Temporary Files 264324\\9955\\ User: NT AUTHORITY\\SYSTEM LogonGuid: {199A51F7-3AB0-5BB7-0000-0020E7030000} LogonId: 
0x3E7 TerminalSessionId: 0 IntegrityLevel: System Hashes: MD5=17E650E888D57AB51E9C3494E49A2045,SHA256=E886D8860CC05F341D764B27277E10091D0B5E313E086C9B93C75E5FB6F3F4B5 ParentProcessGuid: {199A51F7-3B09-5BB7-0000-00104AD10A00} ParentProcessId: 5432 ParentImage: C:\\Program Files\\Microsoft Monitoring Agent\\Agent\\MonitoringHost.exe ParentCommandLine: \"C:\\Program Files\\Microsoft Monitoring Agent\\Agent\\MonitoringHost.exe\" -Embedding"}

[2018-10-11T14:45:50,087][WARN ][logstash.codecs.jsonlines] JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')


(Guy Boertje) #2

Yes you can.

But you appear to have mixed logs. The first two messages shown have the <01>- hostname prelude before a JSON message, while the error log shows a line with a <13>Oct 11 17:46:22 10.55.1.11 prelude before a KV message. AFAICT they are syslog-formatted messages.

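You need to remove the JSON codec from your input. The codec tries to decode each whole line as JSON and fails on the leading < of the prelude.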
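
As a rough sketch of what that could look like: your input section is not shown, so the tcp input and the port number here are assumptions, not your actual settings. The only point is that the codec reads plain lines rather than JSON.

```
input {
  tcp {
    # Hypothetical port; use whatever port your SIEM forwards to.
    port => 5514
    # Read each line as plain text instead of decoding it as JSON,
    # so the prelude no longer triggers a JSON parse error.
    codec => line
  }
}
```

With this in place the raw line arrives in the message field, and the parsing moves into the filter section.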

You can use grok or dissect to parse out the prelude.
After that, use a conditional to steer the JSON events to a json filter branch, with the else branch handling the KV events.
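
If you want to try dissect, a minimal sketch for the <01>- hostname shape might look like this. The field names are just illustrative choices. It only covers that one shape: lines with the <13>Oct 11 ... prelude will not match this mapping and get tagged _dissectfailure, so they still need their own handling.

```
filter {
  dissect {
    mapping => {
      # Delimiters are the literal "<", ">- " and " ";
      # the last field takes the rest of the line.
      "message" => "<%{priority}>- %{ip_or_host} %{kv_or_json}"
    }
    # Store the syslog priority as a number rather than a string.
    convert_datatype => {
      "priority" => "int"
    }
  }
}
```

Dissect splits on fixed delimiters instead of regular expressions, which is why one mapping cannot cover both preludes the way a list of grok patterns can.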

Play with the following config until the events have the shape you want.

input {
  generator {
    lines => [
      '<01>- localhost {"foo": 11, "bar": 12}',
      '<13>Oct 11 17:46:22 10.55.1.11 AgentDevice=WindowsLog   PluginVersion=7.2.8.91'
    ]
    count => 1
  }
}

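filter {
  # Split the syslog prelude from the payload. Patterns are tried in order:
  # first the "<NN>- host payload" shape, then "<NN>timestamp host payload".
  grok {
    match => {
      "message" => [
        '^<%{INT:priority:int}>- %{DATA:ip_or_host} %{DATA:kv_or_json}$',
        '^<%{INT:priority:int}>%{SYSLOGTIMESTAMP:timestamp} %{IPORHOST:ip_or_host} %{DATA:kv_or_json}$'
      ]
    }
    # Stop at the first pattern that matches.
    break_on_match => true
  }
  # A payload starting with "{" is treated as JSON, anything else as key=value pairs.
  if [kv_or_json] =~ /^\{/ {
    json {
      source => '[kv_or_json]'
    }
  } else {
    kv {
      source => "[kv_or_json]"
    }
  }
}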

output {
  stdout {
    codec => rubydebug
  }
}

The events look like this.

{
         "@version" => "1",
             "host" => "Elastics-MacBook-Pro.local",
       "kv_or_json" => "AgentDevice=WindowsLog   PluginVersion=7.2.8.91",
        "timestamp" => "Oct 11 17:46:22",
    "PluginVersion" => "7.2.8.91",
       "ip_or_host" => "10.55.1.11",
      "AgentDevice" => "WindowsLog",
         "sequence" => 0,
         "priority" => 13,
       "@timestamp" => 2018-10-22T11:44:23.051Z,
          "message" => "<13>Oct 11 17:46:22 10.55.1.11 AgentDevice=WindowsLog   PluginVersion=7.2.8.91"
}
{
      "@version" => "1",
           "bar" => 12,
           "foo" => 11,
      "sequence" => 0,
          "host" => "Elastics-MacBook-Pro.local",
    "kv_or_json" => "{\"foo\": 11, \"bar\": 12}",
      "priority" => 1,
    "@timestamp" => 2018-10-22T11:44:23.029Z,
    "ip_or_host" => "localhost",
       "message" => "<01>- localhost {\"foo\": 11, \"bar\": 12}"
}
