I am receiving logs that contain both JSON and XML.
I configured Logstash to receive JSON over TCP and apply a filter when _dateparsefailure appears in the tags.
The JSON part can be extracted, but the XML part can't.
Could anyone give me some advice?
Thanks
Does anyone have an idea?
Please do not post pictures of text. They cannot be searched, are inaccessible to some people and I cannot copy and paste them to experiment with how to fix any problems. Just post the text.
Sorry, my fault.
This is the incoming log message:
<Event><System><Provider Name="Linux-Sysmon" Guid="{ff032593-a8d3-4f13-b0d6-01fc615a0f97}"/><EventID>5</EventID><Version>3</Version><Level>4</Level><Task>5</Task><Opcode>0</Opcode><Keywords>0x8000000000000000</Keywords><TimeCreated SystemTime="2021-11-24T10:00:00.624971000Z"/><EventRecordID>227402</EventRecordID><Correlation/><Execution ProcessID="1095" ThreadID="1095"/><Channel>Linux-Sysmon/Operational</Channel><Computer>ubuntu</Computer><Security UserId="0"/></System><EventData><Data Name="RuleName">-</Data><Data Name="UtcTime">2021-11-23 02:56:47.183</Data><Data Name="ProcessGuid">{3b95acb1-4b99-619b-9535-d747b4550000}</Data><Data Name="ProcessId">732</Data><Data Name="Image">/usr/sbin/multipathd</Data><Data Name="User">root</Data></EventData></Event>
Here is the filter.conf:
filter {
  if "_dateparsefailure" in [tags] {
    xml {
      store_xml => "false"
      source => "Message"
      xpath => [
        "//Event/System/EventID/text()", "Event_id"
      ]
    }
  }
}
Logstash input conf:
input {
  tcp {
    codec => json_lines { charset => "CP1252" }
    port => "5444"
  }
}
filter {
  date {
    locale => "en"
    timezone => "Etc/GMT"
    match => [ "EventTime", "YYYY-MM-dd HH:mm:ss" ]
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "nxlog-linux-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}
Thanks for the help.
If I run logstash with
input { generator { count => 1 lines => [ '<Event><System><Provider Name="Linux-Sysmon" Guid="{ff032593-a8d3-4f13-b0d6-01fc615a0f97}"/><EventID>5</EventID><Version>3</Version><Level>4</Level><Task>5</Task><Opcode>0</Opcode><Keywords>0x8000000000000000</Keywords><TimeCreated SystemTime="2021-11-24T10:00:00.624971000Z"/><EventRecordID>227402</EventRecordID><Correlation/><Execution ProcessID="1095" ThreadID="1095"/><Channel>Linux-Sysmon/Operational</Channel><Computer>ubuntu</Computer><Security UserId="0"/></System><EventData><Data Name="RuleName">-</Data><Data Name="UtcTime">2021-11-23 02:56:47.183</Data><Data Name="ProcessGuid">{3b95acb1-4b99-619b-9535-d747b4550000}</Data><Data Name="ProcessId">732</Data><Data Name="Image">/usr/sbin/multipathd</Data><Data Name="User">root</Data></EventData></Event>' ] } }
filter {
  xml {
    store_xml => "false"
    source => "message"
    xpath => { "//Event/System/EventID/text()" => "Event_id" }
  }
}
output { stdout { codec => rubydebug { metadata => false } } }
then I get
"Event_id" => [
[0] "5"
],
It looks OK to me.
Thanks for the reply.
As I am forwarding the JSON logs to Logstash over TCP, the whole message is shown below:
Nov 30 05:28:02 ubuntu sysmon: <Event><System><Provider Name="Linux-Sysmon" Guid="{ff032593-a8d3-4f13-b0d6-01fc615a0f97}"/><EventID>1</EventID><Version>5</Version><Level>4</Level><Task>1</Task><Opcode>0</Opcode><Keywords>0x8000000000000000</Keywords><TimeCreated SystemTime="2021-11-30T05:28:02.358007000Z"/><EventRecordID>274376</EventRecordID><Correlation/><Execution ProcessID="1095" ThreadID="1095"/><Channel>Linux-Sysmon/Operational</Channel><Computer>ubuntu</Computer><Security UserId="0"/></System><EventData><Data Name="RuleName">-</Data><Data Name="UtcTime">2021-11-24 07:14:26.940</Data><Data Name="ProcessGuid">{3b95acb1-e652-619d-31b4-afeedb550000}</Data><Data Name="ProcessId">77439</Data><Data Name="Image">/usr/bin/cat</Data><Data Name="FileVersion">-</Data><Data Name="Description">-</Data><Data Name="Product">-</Data><Data Name="Company">-</Data><Data Name="OriginalFileName">-</Data><Data Name="CommandLine">cat syslog</Data><Data Name="CurrentDirectory">/home/ubuntu</Data><Data Name="User">ubuntu</Data><Data Name="LogonGuid">{3b95acb1-b63f-61a5-e803-000000000000}</Data><Data Name="LogonId">1000</Data><Data Name="TerminalSessionId">68</Data><Data Name="IntegrityLevel">no level</Data><Data Name="Hashes">-</Data><Data Name="ParentProcessGuid">{3b95acb1-e62f-619d-05d7-79d127560000}</Data><Data Name="ParentProcessId">77404</Data><Data Name="ParentImage">/usr/bin/bash</Data><Data Name="ParentCommandLine">-bash</Data><Data Name="ParentUser">ubuntu</Data></EventData></Event>
Nov 30 05:28:02 ubuntu sysmon: <Event><System><Provider Name="Linux-Sysmon" Guid="{ff032593-a8d3-4f13-b0d6-01fc615a0f97}"/><EventID>5</EventID><Version>3</Version><Level>4</Level><Task>5</Task><Opcode>0</Opcode><Keywords>0x8000000000000000</Keywords><TimeCreated SystemTime="2021-11-30T05:28:02.359124000Z"/><EventRecordID>274377</EventRecordID><Correlation/><Execution ProcessID="1095" ThreadID="1095"/><Channel>Linux-Sysmon/Operational</Channel><Computer>ubuntu</Computer><Security UserId="0"/></System><EventData><Data Name="RuleName">-</Data><Data Name="UtcTime">2021-11-24 07:14:26.941</Data><Data Name="ProcessGuid">{3b95acb1-e652-619d-31b4-afeedb550000}</Data><Data Name="ProcessId">77439</Data><Data Name="Image">/usr/bin/cat</Data><Data Name="User">ubuntu</Data></EventData></Event>
For a [Message] field that contains XML with dynamic data, how can I set up logstash.conf to extract the fields from it?
Thanks a lot.
You would use grok to extract the XML from the log line. But from your original Kibana screenshot it looks like you have already done that.
As I am not familiar with grok, could you give an example for my case of how to extract the message field?
Thanks a lot.
You could try
grok { match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{WORD:hostname} %{WORD}: %{GREEDYDATA:theXML}" } }
filter {
  grok {
    match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{WORD:hostname} %{WORD}: %{GREEDYDATA:Message}" }
  }
  xml {
    store_xml => "false"
    source => "Message"
  }
}
It shows _grokparsefailure and _dateparsefailure.
Should I receive the log as JSON, or just leave it as raw text or XML?
What does the [message] field look like? (Not the [Message] field, that is different.) How does the [Message] field get created?
The whole log looks like this:
The raw log is in JSON format, so some parts of the data can be extracted, but some raw logs are in XML format and end up in the [Message] field; it is probably created by a parsing error.
So I just want to extract the fields from these XML logs.
Thanks.
OK, so forget grok, just do
xml {
  store_xml => "false"
  source => "Message"
  xpath => { "//Event/System/EventID/text()" => "Event_id" }
}
to parse the [Message] (not [message]) field.
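Putting that together with the conditional from your original filter.conf, the whole filter would look something like this (a sketch, using the same field and tag names as above):
filter {
  if "_dateparsefailure" in [tags] {
    xml {
      store_xml => "false"
      source => "Message"
      xpath => { "//Event/System/EventID/text()" => "Event_id" }
    }
  }
}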
It works!
Thanks bro.
Just one more thing: how can I extract the data in
<Event>
  <EventData>
    <Data Name="LogonId">1000</Data>
  </EventData>
</Event>
Thanks a lot.
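For Data elements distinguished only by their Name attribute, an XPath predicate on the attribute should work; a sketch (the Logon_id target field name is just an example):
xml {
  store_xml => "false"
  source => "Message"
  xpath => { "//Event/EventData/Data[@Name='LogonId']/text()" => "Logon_id" }
}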
Sorry, I just figured out the way.
Thanks man.
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.