Winlogbeat may leak PublisherHandle, and lead to failure of sysmon uninstalling

When I use "github.com/elastic/beats/v7/winlogbeat" library to build my app which manages sysmon and reads events of sysmon from windows event system, I found the app uninstall sysmon failed. This is the minimal code to recurrent the issue:

package main

import (
	"context"
	"fmt"
	"os/exec"
	"sync"
	"time"

	"github.com/elastic/beats/v7/libbeat/common"
	"github.com/elastic/beats/v7/winlogbeat/checkpoint"
	"github.com/elastic/beats/v7/winlogbeat/config"
	"github.com/elastic/beats/v7/winlogbeat/eventlog"
	"github.com/elastic/go-ucfg/yaml"
)

func main() {

	fmt.Printf("startup %v\n", time.Now())
	fmt.Printf("install begin %v\n", time.Now())
	exec.Command(".\\Sysmon64.exe", "-i", "-accepteula").CombinedOutput()
	fmt.Printf("install done %v\n", time.Now())

	cfg, err := yaml.NewConfigWithFile("config.yaml")
	if err != nil {
		return
	}
	beatConfig := config.DefaultSettings
	err = cfg.Unpack(&beatConfig)
	if err != nil {
		return
	}
	cp, err := checkpoint.NewCheckpoint(beatConfig.RegistryFile, beatConfig.RegistryFlush)
	if err != nil {
		return
	}
	wg := sync.WaitGroup{}
	ctx, cancel := context.WithCancel(context.Background())
	for _, option := range beatConfig.EventLogs {
		wg.Add(1)
		go func(option *common.Config) {
			defer wg.Done()
			fmt.Printf("handle config %+v\n", option)
			defer fmt.Printf("handle config %+v done\n", option)
			eventLog, err := eventlog.New(option)
			if err != nil {
				return
			}
			err = eventLog.Open(cp.States()[eventLog.Name()])
			if err != nil {
				return
			}
			go func() {
				<-ctx.Done()
				eventLog.Close()
			}()
			for {
				records, err := eventLog.Read()
				if err != nil {
					break
				}
				for _, r := range records {
					cp.PersistState(r.Offset)
				}
				time.Sleep(1 * time.Second)
			}
		}(option)
	}
	time.Sleep(time.Second * 5)
	cancel()
	wg.Wait()
	cp.Shutdown()
	fmt.Printf("uninstall begin %v\n", time.Now())
	exec.Command(".\\Sysmon64.exe", "-u").CombinedOutput()
	fmt.Printf("uninstall done %v\n", time.Now())
}

The content of config.yaml is:

registry_file: .eventlog.yml
registry_flush: 60s
event_logs:
  - name: Microsoft-Windows-Sysmon/Operational
    api: wineventlog
    ignore_older: 24h
    mapping_base: 3600

And Sysmon64.exe is the latest version (v14.1) of sysmon.
This is output of the app:

startup 2022-10-10 23:41:11.5465082 +0800 CST m=+0.025312701
install begin 2022-10-10 23:41:11.5470393 +0800 CST m=+0.025843801
install done 2022-10-10 23:41:12.8880663 +0800 CST m=+1.366875001
handle config &{ctx:{parent:{c:0xc00021ec90} field:0} metadata:0xc000089f30 fields:0xc000096d80}
handle config &{ctx:{parent:{c:0xc00021ec90} field:0} metadata:0xc000089f30 fields:0xc000096d80} done
uninstall begin 2022-10-10 23:41:18.2803849 +0800 CST m=+6.759210401
uninstall done 2022-10-10 23:41:48.5885503 +0800 CST m=+37.067470201 stdout:

System Monitor v14.1 - System activity monitor
By Mark Russinovich and Thomas Garnier
Copyright (C) 2014-2022 Microsoft Corporation
Using libxml2. libxml2 is Copyright (C) 1998-2012 Daniel Veillard. All Rights Reserved.
Sysinternals - www.sysinternals.com

Stopping Sysmon64.
Sysmon64 stopped.
Sysmon64 removed.
Stopping SysmonDrv.
SysmonDrv stopped.
SysmonDrv removed.
Removing service files...............................
Failed to delete C:\Windows\Sysmon64.exe
 <nil>

The uninstall process takes a long time to execute, and failed. I think some resources of winlogbeat used by app aren't closed or freed, so I make a lot of tests to find wihch part of code causes this issue. Finally I confirm:
In winlogbeat/eventlog.winEventLog.Read, winlogbeat/eventlog.winEventLog.render is called

func (l *winEventLog) Read() ([]Record, error) {
	handles, _, err := l.eventHandles(l.maxRead)
	if err != nil || len(handles) == 0 {
		return nil, err
	}
	defer func() {
		for _, h := range handles {
			win.Close(h)
		}
	}()
	detailf("%s EventHandles returned %d handles", l.logPrefix, len(handles))

	var records []Record //nolint:prealloc // This linter gives bad advice and does not take into account conditionals in loops.
	for _, h := range handles {
		l.outputBuf.Reset()
		err := l.render(h, l.outputBuf)
		...
	}
...
}

And in eventlog.winEventLog.render (in this case, it is winlogbeat/sys/wineventlog.RenderEvent):

func RenderEvent(
	eventHandle EvtHandle,
	lang uint32,
	renderBuf []byte,
	pubHandleProvider func(string) sys.MessageFiles,
	out io.Writer,
) error {
	providerName, err := evtRenderProviderName(renderBuf, eventHandle)
	if err != nil {
		return err
	}

	var publisherHandle uintptr
	if pubHandleProvider != nil {
		messageFiles := pubHandleProvider(providerName)
		if messageFiles.Err == nil {
			// There is only ever a single handle when using the Windows Event
			// Log API.
			publisherHandle = messageFiles.Handles[0].Handle
		}
	}

	// Only a single string is returned when rendering XML.
	err = FormatEventString(EvtFormatMessageXml,
		eventHandle, providerName, EvtHandle(publisherHandle), lang, renderBuf, out)
	// Recover by rendering the XML without the RenderingInfo (message string).
	if err != nil {
		// Do not try to recover from InsufficientBufferErrors because these
		// can be retried with a larger buffer.
		if errors.Is(err, sys.InsufficientBufferError{}) {
			return err
		}

		err = RenderEventXML(eventHandle, renderBuf, out)
	}

	return err
}

The publisherHandle is passed to the winlogbeat/sys/wineventlog.FormatEventString:

func FormatEventString(
	messageFlag EvtFormatMessageFlag,
	eventHandle EvtHandle,
	publisher string,
	publisherHandle EvtHandle,
	lang uint32,
	buffer []byte,
	out io.Writer,
) error {
	// Open a publisher handle if one was not provided.
	ph := publisherHandle
	if ph == 0 {
		var err error
		ph, err = OpenPublisherMetadata(0, publisher, lang)
		if err != nil {
			return err
		}
		defer _EvtClose(ph) //nolint:errcheck // This is just a resource release.
	}
	// Create a buffer if one was not provided.
	var bufferUsed uint32
	if buffer == nil {
		err := _EvtFormatMessage(ph, eventHandle, 0, 0, 0, messageFlag,
			0, nil, &bufferUsed)
		if err != nil && err != ERROR_INSUFFICIENT_BUFFER { //nolint:errorlint // This is an errno or nil.
			return err
		}

		bufferUsed *= 2
		buffer = make([]byte, bufferUsed)
		bufferUsed = 0
	}

	err := _EvtFormatMessage(ph, eventHandle, 0, 0, 0, messageFlag,
		uint32(len(buffer)/2), &buffer[0], &bufferUsed)
	bufferUsed *= 2
	if err == ERROR_INSUFFICIENT_BUFFER { //nolint:errorlint // This is an errno or nil.
		return sys.InsufficientBufferError{Cause: err, RequiredSize: int(bufferUsed)}
	}
	if err != nil {
		return err
	}

	// This assumes there is only a single string value to read. This will
	// not work to read keys (when messageFlag == EvtFormatMessageKeyword).
	return common.UTF16ToUTF8Bytes(buffer[:bufferUsed], out)
}

And the local var ph is the publisherHandle. When publisherHandle is null/0, FormatEventString calls OpenPublisherMetadata to open one and defer close it. But when publisherHandle is not null/0, there does nothing, and never close it!

In FormatEventString, ph also is passed to _EvtFormatMessage which is a syscall wrapper, but _EvtFormatMessage neither closes it.

In RenderEvent, publisherHandle is not passes to any other function. It means that the FormatEventString must close it, or it is leaked.

About why close the Publisher Handle: EvtOpenPublisherMetadata function (winevt.h) - Win32 apps | Microsoft Learn
"You must call the EvtClose function to close the metadata handle when done." is metioned.

After I modify the FormatEventString like this:

func FormatEventString(
	messageFlag EvtFormatMessageFlag,
	eventHandle EvtHandle,
	publisher string,
	publisherHandle EvtHandle,
	lang uint32,
	buffer []byte,
	out io.Writer,
) error {
	// Open a publisher handle if one was not provided.
	ph := publisherHandle
	if ph == 0 {
		var err error
		ph, err = OpenPublisherMetadata(0, publisher, lang)
		if err != nil {
			return err
		}
	}
	defer _EvtClose(ph) //nolint:errcheck // This is just a resource release.
	// Create a buffer if one was not provided.
	var bufferUsed uint32
...
}

And the sysmon issue is never found again. If this issue is confirmed, I'm very happy to open a PR of github repo to fix it.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.