Elastic Defend agent high latency on DCs

We're seeing extreme latency on 2022 domain controllers when Elastic Defend 8.6.2 Malicious Behavior rules are enabled. The server becomes very sluggish, but performance metrics don't show any sign of excessive load: low CPU, low memory usage, low/normal network load, and low disk activity. This seems to affect all our DCs, but severity depends on how busy each server is. The most active will jump from a ping latency of 0.5 ms to 1000+ ms in short order.

Since we don't see any perfmon counters that indicate an issue, all we have to go on is the server's responsiveness. Using ICMP uptime monitors, we can quickly see when the problem starts and stops. The issue is extra hard to pin down because a low-load DC may not show the issue at all most of the time, while a high-load one may be fine for a few minutes before it becomes erratic.
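For anyone trying to reproduce this kind of measurement without a full uptime monitor, here is a minimal Python sketch that pings a host once per interval and flags slow or missing replies with a timestamp. It is only an illustration: it assumes the Windows `ping` flags (`-n`, `-w`) and English-language ping output, and the 100 ms threshold is an arbitrary placeholder, not a value from this thread.

```python
import re
import subprocess
import time
from datetime import datetime

# Matches "time=12ms" and "time<1ms" (Windows) as well as "time=0.512 ms".
# Assumption: English-language ping output.
_TIME_RE = re.compile(r"time[=<]\s*([\d.]+)\s*ms", re.IGNORECASE)


def parse_ping_ms(output):
    """Return the round-trip time in ms found in ping output, or None if no reply."""
    match = _TIME_RE.search(output)
    return float(match.group(1)) if match else None


def ping_once(host, timeout_s=2):
    """Send one ICMP echo via the system ping command (Windows flags assumed)."""
    try:
        result = subprocess.run(
            ["ping", "-n", "1", "-w", str(timeout_s * 1000), host],
            capture_output=True, text=True, timeout=timeout_s + 2,
        )
    except subprocess.TimeoutExpired:
        return None
    return parse_ping_ms(result.stdout)


def monitor(host, threshold_ms=100.0, interval_s=1.0):
    """Print one timestamped line per probe; runs until interrupted."""
    while True:
        rtt = ping_once(host)
        # A missing reply is treated the same as a slow one.
        flag = "SLOW" if rtt is None or rtt > threshold_ms else "ok"
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host} rtt_ms={rtt} {flag}")
        time.sleep(interval_s)
```

Calling `monitor("dc01.example.local")` (a hypothetical hostname) and redirecting the output to a file gives a timeline that can be lined up against policy changes.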

Any thoughts on possible causes or methods to narrow this down further would be greatly appreciated.

TIA

We've found these errors on some of the DCs and a few other systems.

YaraLib.cpp:94 Yara rule (Windows_Trojan_CobaltStrike_8519072e) compile warning: string "$a4" may slow down scanning

Not sure whether they are related to the issue, but there seems to be a good chance they are.

Hi @Kelly_Slavens. The Discuss forum software normally emails me when Endpoint Security issues are posted. I'm sorry I don't know why I didn't get one for this issue. Please feel free to hop into the #endpoint-security room in our community Slack. You can usually get a response there pretty quickly during normal business hours.

First, thank you for isolating the performance issue to a specific policy toggle. That's a huge time-saver.

Would you mind sending us a copy of your diagnostics so we can see what's set in policy besides that behavioral protection checkbox? I created a secure upload link here specific to your case. You can collect diagnostics like this:

C:\Temp>"C:\Program Files\Elastic\Agent\elastic-agent.exe" diagnostics
Created diagnostics archive "elastic-agent-diagnostics-2023-04-24T16-16-03Z-00.zip"

The malicious behavior protection system requires a variety of event types to function. Even if those events aren't configured to stream to Elasticsearch, the Endpoint still needs to collect and enrich them to make them available to the behavioral rules engine. Event sources can be forcefully disabled using advanced policy options. While these advanced switches are useful for troubleshooting, they may or may not be an ideal long-term solution because some data sources are used for features besides events and behavioral protection. You can try setting these and seeing if performance improves, either one by one or with a binary search (divide and conquer), as git bisect does.

  • advanced.kernel.filewrite: false
  • advanced.kernel.network: false
  • advanced.kernel.fileopen: false
  • advanced.kernel.asyncimageload: false
  • advanced.kernel.syncimageload: false
  • advanced.kernel.registry: false
  • advanced.kernel.process_handle: false
  • advanced.kernel.fileaccess: false
  • advanced.kernel.registryaccess: false
  • advanced.kernel.image_and_process_file_timestamps: false
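The divide-and-conquer approach over the settings above can be sketched in Python as follows. This is only an illustration of the search order, not an Elastic tool: `improves_when_disabled` is a hypothetical callback that you would implement by hand (set the listed options to false in policy, wait, and observe latency), and the sketch assumes a single event source is responsible.

```python
# Advanced policy options listed above; each would be set to false to disable it.
KERNEL_SETTINGS = [
    "advanced.kernel.filewrite",
    "advanced.kernel.network",
    "advanced.kernel.fileopen",
    "advanced.kernel.asyncimageload",
    "advanced.kernel.syncimageload",
    "advanced.kernel.registry",
    "advanced.kernel.process_handle",
    "advanced.kernel.fileaccess",
    "advanced.kernel.registryaccess",
    "advanced.kernel.image_and_process_file_timestamps",
]


def bisect_culprit(settings, improves_when_disabled):
    """Return the one setting whose disabling restores performance, or None.

    improves_when_disabled(disabled) is a hypothetical callback: it receives
    the list of settings currently set to false and returns True if latency
    went back to normal. Assumes exactly one event source is the culprit.
    """
    candidates = list(settings)
    # Sanity check: if disabling everything doesn't help, look elsewhere.
    if not improves_when_disabled(candidates):
        return None
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        if improves_when_disabled(first_half):
            candidates = first_half
        else:
            candidates = candidates[mid:]
    return candidates[0]


def ask_operator(disabled):
    """Interactive callback: prompt a person to apply the policy and report back."""
    print("Set these options to false and re-test:")
    for name in disabled:
        print(f"  {name}: false")
    return input("Did latency return to normal? [y/n] ").strip().lower() == "y"
```

With ten settings, `bisect_culprit(KERNEL_SETTINGS, ask_operator)` needs about five policy changes instead of up to ten. If the slowdown comes from several event sources combined, the single-culprit assumption breaks and testing one by one is the safer route.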

You can find advanced settings at the bottom of the Defend integration policy page.

If you find a combination that works for you, please let us know, and please don't hesitate to reach out to us on Slack.
