Winlogbeat experimental api missmatch between paresed fields and rendered event

Hi folks,

in Winlogbeat I recently switched from the default windows event log API to the experimental one, because winlogbeat was capping out at roughly 1800 events per second and our Windows event collector getting a lot more than that. While the EPS increase is truly great, I noticed a very weird behaviour afterwards, which I would consider a bug.

With the new API there is a mismatch between the parsed fields and and the rendered content field. They simply don't match each other as if they were taken from two different events. For example these two are fields from the same event in my Elastic:

As you can see, the rendered content doesn't match the parsed fields. This only happens using the experimental API to ingest the events in winlogbeat. The moment I switch back to the classical API the fields match but then I can't get enough performance to keep up with my inbound logs.

My winlogbeat config is pretty straight forward, just pick up everything from Forwarded Events, use the security parser and hand off to a Kafka queue.
Winlogbeat config snippet

Within the Windows event log the logs are fine but right after winlogbeat on Kafka they are already mixed up like shown above.

Can someone confirm this mismatch using the experimental API? If so, I probably would consider this a bug and raise an issue on Github.

Best regards,
Sebastian

Can you share the full document JSON (the _source part) with redactions as needed.

Sure, a tiny bit redacted but mostly complete:

"_source": {
    "winlog": {
      "event_data": {
        "TargetOutboundDomainName": "-",
        "TargetOutboundUserName": "-",
        "TargetLinkedLogonId": "0x0",
        "SubjectLogonId": "0x0",
        "SubjectDomainName": "-",
        "LogonType": "3",
        "RestrictedAdminMode": "-",
        "TargetUserName": "RUSARREDACT$",
        "AuthenticationPackageName": "Kerberos",
        "VirtualAccount": "%%1843",
        "ImpersonationLevel": "%%1833",
        "TargetUserSid": "S-1-5-21-1417001333-1111-2222-828430",
        "KeyLength": "0",
        "LmPackageName": "-",
        "ElevatedToken": "%%1842",
        "TargetDomainName": "REDACT",
        "TargetLogonId": "0x27486b0b3",
        "SubjectUserName": "-",
        "SubjectUserSid": "S-1-0-0",
        "LogonProcessName": "Kerberos",
        "TransmittedServices": "-",
        "LogonGuid": "{ef061661-680d-fa70-8b19-b355dd7d1845}"
      },
      "provider_guid": "{54849625-5478-4994-A5BA-3E3B0328C30D}",
      "record_id": 2886739652,
      "channel": "Security",
      "provider_name": "Microsoft-Windows-Security-Auditing",
      "task": "Logon",
      "process": {
        "thread": {
          "id": 4292
        },
        "pid": 1460
      },
      "keywords": [
        "Audit Success"
      ],
      "event_id": "4624",
      "logon": {
        "type": "Network",
        "id": "0x27486b0b3"
      },
      "api": "wineventlog-experimental",
      "computer_name": "REDACT",
      "version": 2,
      "opcode": "Info"
    },
    "ecs": {
      "version": "1.11.0"
    },
    "agent": {
      "name": "redact",
      "type": "winlogbeat",
      "ephemeral_id": "4ffc6120-79ef-4314-b54a-49525aa2330b",
      "id": "156ae148-2948-45e1-97b6-acc171f2d470",
      "version": "7.15.0",
      "hostname": "redact"
    },
    "related": {
      "ip": "10.33.152.202",
      "user": "RUSARREDACT$"
    },
    "user": {
      "name": "RUSARREDACT$",
      "id": "S-1-5-21-1417001333-1111-2222-828430",
      "domain": "REDACT"
    },
    "host": {},
    "@version": "1",
    "process": {
      "name": "-",
      "executable": "-",
      "pid": 0
    },
    "observer": {
      "vendor": "Microsoft",
      "type": "srv_os_windows",
      "hostname": "REDACT"
    },
    "log": {
      "level": "information",
      "original": "An account was successfully logged on.\n\nSubject:\n\tSecurity ID:\t\tS-1-0-0\n\tAccount Name:\t\t-\n\tAccount Domain:\t\t-\n\tLogon ID:\t\t0x0\n\nLogon Information:\n\tLogon Type:\t\t3\n\tRestricted Admin Mode:\t-\n\tVirtual Account:\t\tNo\n\tElevated Token:\t\tYes\n\nImpersonation Level:\t\tImpersonation\n\nNew Logon:\n\tSecurity ID:\t\tS-1-5-21-1417001333-1111-2222-1013716\n\tAccount Name:\t\tMXINTREDACT$\n\tAccount Domain:\t\tREDACT\n\tLogon ID:\t\t0x7FB65FC4\n\tLinked Logon ID:\t\t0x0\n\tNetwork Account Name:\t-\n\tNetwork Account Domain:\t-\n\tLogon GUID:\t\t{030882ab-e781-7c48-0675-83a721d41e5f}\n\nProcess Information:\n\tProcess ID:\t\t0x0\n\tProcess Name:\t\t-\n\nNetwork Information:\n\tWorkstation Name:\t-\n\tSource Network Address:\t10.147.27.104\n\tSource Port:\t\t57921\n\nDetailed Authentication Information:\n\tLogon Process:\t\tKerberos\n\tAuthentication Package:\tKerberos\n\tTransited Services:\t-\n\tPackage Name (NTLM only):\t-\n\tKey Length:\t\t0\n\nThis event is generated when a logon session is created. It is generated on the computer that was accessed.\n\nThe subject fields indicate the account on the local system which requested the logon. This is most commonly a service such as the Server service, or a local process such as Winlogon.exe or Services.exe.\n\nThe logon type field indicates the kind of logon that occurred. The most common types are 2 (interactive) and 3 (network).\n\nThe New Logon fields indicate the account for whom the new logon was created, i.e. the account that was logged on.\n\nThe network fields indicate where a remote logon request originated. Workstation name is not always available and may be left blank in some cases.\n\nThe impersonation level field indicates the extent to which a process in the logon session can impersonate.\n\nThe authentication information fields provide detailed information about this specific logon request.\n\t- Logon GUID is a unique identifier that can be used to correlate this event with a KDC event.\n\t- Transited services indicate which intermediate services have participated in this logon request.\n\t- Package name indicates which sub-protocol was used among the NTLM protocols.\n\t- Key length indicates the length of the generated session key. This will be 0 if no session key was requested."
    },
    "event": {
      "created": "2022-04-27T12:41:34.641Z",
      "code": "4624",
      "module": "security",
      "kind": "event",
      "type": [
        "start"
      ],
      "category": [
        "authentication"
      ],
      "ingested": "2022-04-27T12:41:38.585Z",
      "provider": "Microsoft-Windows-Security-Auditing",
      "outcome": "success",
      "action": "logged-in"
    },
    "source": {
      "port": 61291,
      "ip": "10.33.152.202",
      "address_space": "Private IP",
      "domain": "-"
    }
  }

I also tried to do some scoping. Looks like this doesn't happen on all Windows events. The only event ID I could confirm it on is Windows event ID 4624 (successful login).

Plus it looks like the rendered text (log.original) is always the same over multiple events. Cloud it maybe be some caching issue? The original Github issue about the new experimental API mentions that there is caching in place https://github.com/elastic/beats/pull/17022

Where is log.original coming from? That's not a field that Winlogbeat produces AFAIK (it uses message).

That comes out of my own parsing chain. Since I favor logstash over Elastic ingest pipelines and have a big logstash pipeline for multiple things in place my setup (for windows logs) is like this:

Windows Event Collection => Winlogbeat => Kafka => Logstash => Elastic

One of the very few things done on Logstash is renaming message to log.original. Haven't caught up to the latest ECS versions (log.original vs. event.original) yet.

The issue is already visible on Kafka though. So it is clearly something coming out of Winlogbeat.