Hi, I am facing an issue where the decode_xml and decode_xml_wineventlog processors fail to parse input data when it is in UTF-16. The following errors are produced:
failed in decode_xml on the "winlog.event_data.TaskContent" field: error decoding XML field: xml: encoding "UTF-16" declared but Decoder.CharsetReader is nil
failed in decode_xml_wineventlog on the "winlog.event_data.TaskContent" field: error decoding XML field: xml: encoding "UTF-16" declared but Decoder.CharsetReader is nil
This is the processor config:
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - decode_xml_wineventlog:
      field: winlog.event_data.TaskContent
      target_field: ""
An example field value that causes the error:
<?xml version="1.0" encoding="UTF-16"?>
<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <RegistrationInfo>
    <Date>2022-01-27T10:20:14.8230337</Date>
    <Author>mesposito</Author>
    <Description>says good morning</Description>
    <URI>\Good Morning</URI>
  </RegistrationInfo>
  <Triggers>
    <CalendarTrigger>
      <StartBoundary>2022-01-27T10:22:35</StartBoundary>
      <Enabled>true</Enabled>
      <ScheduleByDay>
        <DaysInterval>1</DaysInterval>
      </ScheduleByDay>
    </CalendarTrigger>
  </Triggers>
...
This field and data can be recreated by scheduling a task using the Windows Task Scheduler. It looks like this might be happening because Go's encoding/xml decoder returns an error when the XML declaration names a non-UTF-8 charset and no CharsetReader has been set on the Decoder.
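
To illustrate the suspected cause, here is a minimal standalone Go sketch using only the standard library. It is not the Beats code. The names sample, task, passthrough, and decodeTask are hypothetical, and it assumes the field value has already been converted to a UTF-8 Go string while the declaration still says UTF-16, so the CharsetReader can return the input unchanged.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// sample is a shortened copy of the TaskContent value. It is assumed to be
// UTF-8 text already, even though the declaration still says UTF-16.
const sample = `<?xml version="1.0" encoding="UTF-16"?>
<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <RegistrationInfo>
    <Author>mesposito</Author>
  </RegistrationInfo>
</Task>`

// task maps only the one element needed for this demonstration.
type task struct {
	Author string `xml:"RegistrationInfo>Author"`
}

// passthrough is a hypothetical CharsetReader that ignores the declared
// charset label and returns the input as-is. This is only valid under the
// assumption that the data is already UTF-8.
func passthrough(label string, input io.Reader) (io.Reader, error) {
	return input, nil
}

// decodeTask decodes data with or without a CharsetReader installed.
func decodeTask(data string, withCharsetReader bool) (task, error) {
	dec := xml.NewDecoder(strings.NewReader(data))
	if withCharsetReader {
		dec.CharsetReader = passthrough
	}
	var t task
	err := dec.Decode(&t)
	return t, err
}

func main() {
	// Without a CharsetReader the decoder rejects the UTF-16 declaration.
	_, err := decodeTask(sample, false)
	fmt.Println("without CharsetReader:", err)

	// With the passthrough reader the same input decodes.
	t, err := decodeTask(sample, true)
	fmt.Println("with CharsetReader:", t.Author, err)
}
```

If the field instead holds raw UTF-16 bytes, a passthrough reader would not be enough, and a transcoding reader (for example one built on golang.org/x/text/encoding/unicode) would be needed.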