decode_xml fails on UTF-16 encoded input

Hi, I am facing an issue where the decode_xml and decode_xml_wineventlog processors fail to parse a field whose XML declaration specifies UTF-16 encoding. The following errors are produced:

failed in decode_xml on the "winlog.event_data.TaskContent" field: error decoding XML field: xml: encoding "UTF-16" declared but Decoder.CharsetReader is nil
failed in decode_xml_wineventlog on the "winlog.event_data.TaskContent" field: error decoding XML field: xml: encoding "UTF-16" declared but Decoder.CharsetReader is nil

This is the processor config:

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - decode_xml_wineventlog:
      field: winlog.event_data.TaskContent
      target_field: ""

An example field which causes the error:

<?xml version="1.0" encoding="UTF-16"?>
<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <RegistrationInfo>
    <Date>2022-01-27T10:20:14.8230337</Date>
    <Author>mesposito</Author>
    <Description>says good morning</Description>
    <URI>\Good Morning</URI>
  </RegistrationInfo>
  <Triggers>
    <CalendarTrigger>
      <StartBoundary>2022-01-27T10:22:35</StartBoundary>
      <Enabled>true</Enabled>
      <ScheduleByDay>
        <DaysInterval>1</DaysInterval>
      </ScheduleByDay>
    </CalendarTrigger>
  </Triggers>
  ...

This field and data can be reproduced by scheduling a task with the Windows Task Scheduler. It looks like this happens because Go's encoding/xml returns an error when a document declares a non-UTF-8 charset and no CharsetReader has been set on the Decoder.
