Beats 7.7.0 won't run on some of our Windows clients

We're trying to update our Beats from v.7.6.2 to 7.7.0, but for some reason ~25-30% of them fail to launch the service after installing. v.7.6.2 was running just fine.

Debug log from a launch attempt of AuditBeat 7.7.0:

2020-05-26T13:22:02.227+0200	INFO	instance/beat.go:621	Home path: [C:\Program Files\AuditBeat] Config path: [C:\Program Files\AuditBeat] Data path: [C:\Program Files\AuditBeat\data] Logs path: [C:\Program Files\AuditBeat\logs]
2020-05-26T13:22:02.227+0200	DEBUG	[beat]	instance/beat.go:673	Beat metadata path: C:\Program Files\AuditBeat\data\meta.json
2020-05-26T13:22:02.228+0200	INFO	instance/beat.go:629	Beat ID: d4284548-792e-4eb0-84df-ce7fc1d9e409
2020-05-26T13:22:02.253+0200	DEBUG	[conditions]	conditions/conditions.go:98	New condition regexp: map[]
2020-05-26T13:22:02.254+0200	DEBUG	[conditions]	conditions/conditions.go:98	New condition regexp: map[]
2020-05-26T13:22:02.254+0200	DEBUG	[conditions]	conditions/conditions.go:98	New condition regexp: map[] or regexp: map[]
2020-05-26T13:22:02.254+0200	DEBUG	[processors]	processors/processor.go:101	Generated new processors: add_host_metadata=[netinfo.enabled=[true], cache.ttl=[5m0s]], add_fields={"host":{"<redacted>":{"assetid":<redacted>,"customerid":<redacted>}}}, drop_event, condition=regexp: map[] or regexp: map[]
2020-05-26T13:22:02.254+0200	DEBUG	[seccomp]	seccomp/seccomp.go:96	Syscall filtering is only supported on Linux
2020-05-26T13:22:02.254+0200	INFO	[beat]	instance/beat.go:957	Beat info	{"system_info": {"beat": {"path": {"config": "C:\\Program Files\\AuditBeat", "data": "C:\\Program Files\\AuditBeat\\data", "home": "C:\\Program Files\\AuditBeat", "logs": "C:\\Program Files\\AuditBeat\\logs"}, "type": "auditbeat", "uuid": "d4284548-792e-4eb0-84df-ce7fc1d9e409"}}}
2020-05-26T13:22:02.254+0200	INFO	[beat]	instance/beat.go:966	Build info	{"system_info": {"build": {"commit": "5e69e25b920e3d93bec76a09a31da3ab35a55607", "libbeat": "7.7.0", "time": "2020-05-12T00:48:56.000Z", "version": "7.7.0"}}}
2020-05-26T13:22:02.254+0200	INFO	[beat]	instance/beat.go:969	Go runtime info	{"system_info": {"go": {"os":"windows","arch":"amd64","max_procs":4,"version":"go1.13.9"}}}
2020-05-26T13:22:02.265+0200	INFO	[beat]	instance/beat.go:973	Host info	{"system_info": {"host": {"architecture":"x86_64","boot_time":"2020-05-18T04:54:26.39+02:00","name":"<redacted>","ip":["<redacted>","<redacted>","<redacted>","10.0.2.30/22","::1/128","127.0.0.1/8","<redacted>","<redacted>","<redacted>"],"kernel_version":"6.3.9600.19697 (winblue_ltsb.200411-0600)","mac":["00:50:56:07:14:c0","00:50:56:07:14:bf","00:00:00:00:00:00:00:e0","00:00:00:00:00:00:00:e0","00:00:00:00:00:00:00:e0"],"os":{"family":"windows","platform":"windows","name":"Windows Server 2012 R2 Standard","version":"6.3","major":3,"minor":0,"patch":0,"build":"9600.19701"},"timezone":"CEST","timezone_offset_sec":7200,"id":"460c64b3-811b-42da-990f-3ebd45af8a69"}}}
2020-05-26T13:22:02.278+0200	INFO	[beat]	instance/beat.go:1002	Process info	{"system_info": {"process": {"cwd": "C:\\windows\\system32", "exe": "C:\\Program Files\\AuditBeat\\AuditBeat.exe", "name": "AuditBeat.exe", "pid": 7404, "ppid": 540, "start_time": "2020-05-26T13:22:01.314+0200"}}}
2020-05-26T13:22:02.278+0200	INFO	instance/beat.go:297	Setup Beat: auditbeat; Version: 7.7.0
2020-05-26T13:22:02.278+0200	DEBUG	[beat]	instance/beat.go:323	Initializing output plugins
2020-05-26T13:22:02.279+0200	INFO	eslegclient/connection.go:84	elasticsearch url: https://<redacted>:443
2020-05-26T13:22:02.279+0200	DEBUG	[publisher]	pipeline/consumer.go:137	start pipeline event consumer
2020-05-26T13:22:02.279+0200	INFO	[publisher]	pipeline/module.go:110	Beat name: <redacted>
2020-05-26T13:22:02.279+0200	DEBUG	[modules]	beater/metricbeat.go:148	Available modules and metricsets: Register [ModuleFactory:[system], MetricSetFactory:[auditd/auditd, file_integrity/file, system/host, system/login, system/package, system/process, system/socket, system/user]]
2020-05-26T13:22:02.280+0200	DEBUG	[file_integrity]	file_integrity/metricset.go:99	Initialized the file event reader. Running as euid=-1
2020-05-26T13:22:02.292+0200	WARN	[cfgwarn]	host/host.go:167	BETA: The system/host dataset is beta
2020-05-26T13:22:02.297+0200	DEBUG	[system]	host/host.go:448	Restored last host information from disk.
2020-05-26T13:22:02.308+0200	WARN	[cfgwarn]	process/process.go:131	BETA: The system/process dataset is beta
2020-05-26T13:22:02.311+0200	DEBUG	[process]	process/process.go:168	Last state was sent at 2020-05-25 07:59:05.2643965 +0200 CEST. Next state update by 2020-05-25 19:59:05.2643965 +0200 CEST.
2020-05-26T13:22:02.311+0200	INFO	instance/beat.go:438	auditbeat start running.
2020-05-26T13:22:02.312+0200	DEBUG	[module]	module/wrapper.go:127	Starting Wrapper[name=file_integrity, len(metricSetWrappers)=1]
2020-05-26T13:22:02.312+0200	DEBUG	[module]	module/wrapper.go:127	Starting Wrapper[name=system, len(metricSetWrappers)=1]
2020-05-26T13:22:02.312+0200	DEBUG	[service]	service/service_windows.go:72	Windows is interactive: false
2020-05-26T13:22:02.312+0200	DEBUG	[module]	module/wrapper.go:181	file_integrity/file will start after 2.64675034s
2020-05-26T13:22:02.312+0200	DEBUG	[module]	module/wrapper.go:181	system/host will start after 7.662300006s
2020-05-26T13:22:02.312+0200	DEBUG	[module]	module/wrapper.go:127	Starting Wrapper[name=system, len(metricSetWrappers)=1]
2020-05-26T13:22:02.312+0200	DEBUG	[module]	module/wrapper.go:181	system/process will start after 1.550783582s
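
A quick way to confirm that a log like the one above really ends without any error entry is to count lines per level and look at the last entry. This is only a minimal sketch: the three sample lines are copied from the log above, the function name `summarize` is made up for illustration, and the assumed layout (timestamp, level, then the rest, separated by whitespace) is inferred from how these lines look rather than from any Beats documentation.

```python
from collections import Counter

# Three lines copied from the debug log above (fields are tab-separated).
SAMPLE_LOG = """\
2020-05-26T13:22:02.292+0200\tWARN\t[cfgwarn]\thost/host.go:167\tBETA: The system/host dataset is beta
2020-05-26T13:22:02.311+0200\tINFO\tinstance/beat.go:438\tauditbeat start running.
2020-05-26T13:22:02.312+0200\tDEBUG\t[module]\tmodule/wrapper.go:181\tsystem/process will start after 1.550783582s
"""


def summarize(log_text):
    """Return (count of lines per level, last parsed entry) for a Beats-style log."""
    levels = Counter()
    last_entry = None
    for line in log_text.splitlines():
        # Assumed layout: <timestamp> <LEVEL> <everything else>
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        timestamp, level, rest = parts
        levels[level] += 1
        last_entry = (timestamp, level, rest)
    return levels, last_entry


levels, last_entry = summarize(SAMPLE_LOG)
print(dict(levels))
print("last entry:", last_entry[0], last_entry[1])
```

If the count for `ERROR` is zero and the last entry is an ordinary `DEBUG` line, the Beat itself never logged a reason for stopping, which points at something outside the process.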

The log gives no explanation as to why the process stops; this is all I see from the Service Launch event:

Any hint appreciated!

TIA

Hm, launching the service by hand in an Administrator CMD.exe works fine, though...

We're seeing increasing reports of this problem all of a sudden, but it doesn't seem to be specific to 7.7.0. It's also happening to 7.6 and older versions.

Can you confirm that in the same system, 7.6.2 starts as a service but 7.7.0 doesn't?

Can you share the list of installed updates using Get-HotFix or a similar tool?

Hm, right: after reverting to v.7.6.2 the service still won't launch. It seems the service last got [re]launched by Tanium on 18 May.

It was last patched on 17 May with these patches:

<redacted>   Update           KB4552923     NT AUTHORITY\SYSTEM  17-05-2020...
<redacted>   Update           KB4552982     NT AUTHORITY\SYSTEM  17-05-2020...
<redacted>   Security Update  KB4556846     NT AUTHORITY\SYSTEM  17-05-2020...

and prior to that, these in 2020:

<redacted>   Update           KB4532931     NT AUTHORITY\SYSTEM  20-01-2020...
<redacted>   Update           KB4532946     NT AUTHORITY\SYSTEM  20-01-2020...
<redacted>   Security Update  KB4540725     NT AUTHORITY\SYSTEM  11-03-2020...
<redacted>   Update           KB4534117     NT AUTHORITY\SYSTEM  16-03-2020...

It seems that all of our endpoints that fail to launch Beat services, except one (an older Windows Server 2008), are running 'Windows Server 2012 R2 Standard', but we also have multiple 2012 R2 endpoints working with v.7.7.0. Will sample the hotfixes on a working EP...

@stefws Could it be related to a certain module / processor? What if you disable some?

@adrisr is there a GitHub issue for this?

These are the Hot Fixes from a working 2012 R2 Standard Server:

<redacted>    Security Update  KB4532961     NT AUTHORITY\SYSTEM  19-01-2020 00:00:00
<redacted>    Security Update  KB4534251     NT AUTHORITY\SYSTEM  19-01-2020 00:00:00
<redacted>    Security Update  KB4534309     NT AUTHORITY\SYSTEM  19-01-2020 00:00:00
<redacted>    Security Update  KB4537767     NT AUTHORITY\SYSTEM  16-02-2020 00:00:00
<redacted>    Security Update  KB4537803     NT AUTHORITY\SYSTEM  16-02-2020 00:00:00
<redacted>    Security Update  KB4540671     NT AUTHORITY\SYSTEM  15-03-2020 00:00:00
<redacted>    Security Update  KB4540725     NT AUTHORITY\SYSTEM  11-03-2020 00:00:00
<redacted>    Security Update  KB4541505     NT AUTHORITY\SYSTEM  15-03-2020 00:00:00
<redacted>    Security Update  KB4550905     NT AUTHORITY\SYSTEM  18-04-2020 00:00:00
<redacted>    Security Update  KB4550970     NT AUTHORITY\SYSTEM  18-04-2020 00:00:00
<redacted>    Security Update  KB4552966     NT AUTHORITY\SYSTEM  16-05-2020 00:00:00
<redacted>    Security Update  KB4556798     NT AUTHORITY\SYSTEM  16-05-2020 00:00:00
<redacted>    Security Update  KB4556853     NT AUTHORITY\SYSTEM  16-05-2020 00:00:00
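
To make the two hotfix lists easier to compare, here is a small sketch that pulls the KB numbers out of pasted `Get-HotFix` output and diffs them as sets. The KB numbers are exactly the ones quoted in this thread; the helper name `kb_ids` and the idea of matching on `KB<digits>` are my own assumptions for illustration. It only shows which updates differ between the two hosts and says nothing about whether any of them is the cause.

```python
import re


def kb_ids(text):
    """Extract the set of KB identifiers (e.g. 'KB4540725') from pasted text."""
    return set(re.findall(r"KB\d+", text))


# KBs listed for the failing 2012 R2 endpoint (May 2020 plus earlier 2020).
failing = kb_ids("""
KB4552923 KB4552982 KB4556846
KB4532931 KB4532946 KB4540725 KB4534117
""")

# KBs listed for the working 2012 R2 Standard server.
working = kb_ids("""
KB4532961 KB4534251 KB4534309 KB4537767 KB4537803 KB4540671 KB4540725
KB4541505 KB4550905 KB4550970 KB4552966 KB4556798 KB4556853
""")

common = failing & working
only_failing = failing - working
only_working = working - failing

print("on both hosts:     ", sorted(common))
print("only on failing EP:", sorted(only_failing))
print("only on working EP:", sorted(only_working))
```

With the lists as posted, the two hosts share only a single KB, so the patch sets differ too much for a simple diff to single out one suspect update.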

It works if I run the service's cmd+args from the CLI, so I think not. Also, the config of modules/processors is the same as when it last worked, and the Beat's log at debug level shows nothing when the service quits...

Weirdly, without anything having been changed, it seems that right now ~50% of the failing EPs were able to launch the Beat service... could it have been a SCEP rule deeming the binary dangerous?

On the Windows Server 2008 EP, MetricBeat v.7.7.0 is running just fine, but for both AuditBeat and WinlogBeat v.7.7.0 Windows claims they aren't valid Win32 applications when I attempt to launch them from an Administrator CMD.exe...

Can you tell us about the status atm? Have they all started running? Also can you check what could have changed in those cases?

  • a system reboot, a new Windows update, reinstalling the service, disabling any firewalls, etc.

Status atm is that we've still got 35 failing, out of previously 104 EPs.

Nothing has changed IMHO. One specific, previously failing EP hasn't been rebooted since the last OS patch on 17 May. It has failed since 25 May, when it was first patched to 7.7.0. I then tried reinstalling v.7.6.2, which still failed to launch, and reinstalled v.7.7.0, which kept failing until suddenly this morning it launched while I was investigating it, without anything having been changed. Then we attempted through Tanium to restart the service wherever it wasn't running; ~50% worked initially, and now, after multiple reinstall/restart attempts, we're down to 35 EPs still failing. The only things that may have changed since yesterday are SCEP and time IMHO. I'm told by our Windows team that Windows is sometimes seen failing to launch services for no obvious reason... no wonder I ha.. M$ Windows :)

Still wondering, though, why the single Windows Server 2008 Standard claims that the Winlog+Audit Beats are not valid Win32 apps while MetricBeat just works fine.
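
The thread doesn't settle this, so the following is only a guess at something worth checking, not a confirmed cause: "not a valid Win32 application" is the error Windows typically gives when an executable's target architecture or minimum subsystem version doesn't fit the OS, or when the file is damaged. The sketch below reads both values from the PE header so the three Beat binaries can be compared. The function names `pe_info` and `describe_file` are made up for illustration, and the 4096-byte read assumes the PE header sits near the start of the file.

```python
import struct

# COFF "Machine" values for the two common Windows targets.
MACHINES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}


def pe_info(data):
    """Return (machine, (major, minor) subsystem version) from raw PE header bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the 'PE\0\0' signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The COFF header follows the signature; Machine is its first field.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    # The optional header starts after the 4-byte signature and 20-byte COFF header.
    # Major/MinorSubsystemVersion are at offset 48 in both PE32 and PE32+.
    optional_header = pe_offset + 24
    major, minor = struct.unpack_from("<HH", data, optional_header + 48)
    return MACHINES.get(machine, hex(machine)), (major, minor)


def describe_file(path):
    """Read the start of an .exe and report its target machine and subsystem version."""
    with open(path, "rb") as handle:
        return pe_info(handle.read(4096))
```

Running `describe_file` against the MetricBeat, AuditBeat and WinlogBeat executables on the 2008 EP would show whether the two failing ones differ from the working one in architecture or required subsystem version. A `ValueError` from either signature check would instead suggest a truncated or corrupted file.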

Got our Tanium Server to attempt a relaunch of any non-running Beat every hour; now it's down to 15x 2012R2 EPs and 1x 2008 EP. Expecting that time will at least solve the 15x 2012R2 EPs; not sure about the 2008 EP, though we've got 2008R2 EPs working.
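
For anyone without Tanium, the "check and relaunch if not running" step could look roughly like the sketch below. It is not how Tanium does it. It assumes the service is registered under the name `auditbeat` and relies on the standard Windows `sc.exe query` / `sc.exe start` commands; the function names and the injectable `run_sc` parameter are illustrative only. Scheduling it hourly would be left to Task Scheduler or similar.

```python
import subprocess


def service_is_running(sc_query_output):
    """Return True if the STATE line of `sc query` output reports RUNNING."""
    for line in sc_query_output.splitlines():
        if line.strip().startswith("STATE"):
            return "RUNNING" in line
    return False  # no STATE line, e.g. the service does not exist


def relaunch_if_stopped(service, run_sc=None):
    """Start `service` if it is not running; return True if a start was attempted."""
    if run_sc is None:
        # Default runner: call the real sc.exe and return its stdout (Windows only).
        def run_sc(*args):
            result = subprocess.run(["sc.exe", *args], capture_output=True, text=True)
            return result.stdout
    if service_is_running(run_sc("query", service)):
        return False
    run_sc("start", service)
    return True
```

Calling `relaunch_if_stopped("auditbeat")` once per hour mirrors the behaviour described above. Passing a fake `run_sc` makes it possible to try the logic on a machine without the service installed.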

Trending of EPs sending data to Elastic per hour:

[Image: Screenshot 2020-05-28 at 07.55.26]

Almost there: now down to 2x 2012R2 EPs and the one 2008 EP not running :)
