Endpoint 7.9 "Degraded and dashboards"

Brand new Windows 10 LTSC machine, not even patched. Agent version 7.9.2, a new policy named "For_You_Ferullo" with 1 agent named TESTBOX and Endpoint enabled. Mimikatz downloaded, and of course Chrome hates it and flags it, so you have to allow it. Windows Defender disabled just to keep it from stepping in.

"Artifacts.cpp:2298 Failed to download artifact endpoint-exceptionlist-windows-v1 - Failure in an external software component"

{"@timestamp":"2020-10-02T00:22:41.78407700Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"error","origin":{"file":{"line":629,"name":"SyncKernelMessageManager.cpp"}}},"message":"SyncKernelMessageManager.cpp:629 Process ID 1608: [C:\mimikatz_trunk\x64\mimikatz.exe] is allowed due to message processing failure, error code -205","process":{"pid":8188,"thread":{"id":8548}}}
{"@timestamp":"2020-10-02T00:22:41.78407700Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"error","origin":{"file":{"line":629,"name":"SyncKernelMessageManager.cpp"}}},"message":"SyncKernelMessageManager.cpp:629 Process ID 1608: [C:\mimikatz_trunk\x64\mimikatz.exe] is allowed due to message processing failure, error code -205","process":{"pid":8188,"thread":{"id":8548}}}
{"@timestamp":"2020-10-02T00:22:41.72155100Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":746,"name":"FileScore.cpp"}}},"message":"FileScore.cpp:746 Sending alert for [C:\mimikatz_trunk\x64\mimikatz.exe]","process":{"pid":8188,"thread":{"id":9056}}}
{"@timestamp":"2020-10-02T00:22:41.72155100Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":746,"name":"FileScore.cpp"}}},"message":"FileScore.cpp:746 Sending alert for [C:\mimikatz_trunk\x64\mimikatz.exe]","process":{"pid":8188,"thread":{"id":9056}}}
{"@timestamp":"2020-10-02T00:23:11.79709900Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"error","origin":{"file":{"line":629,"name":"SyncKernelMessageManager.cpp"}}},"message":"SyncKernelMessageManager.cpp:629 Process ID 8616: [C:\mimikatz_trunk\x64\mimidrv.sys] is allowed due to message processing failure, error code -205","process":{"pid":8188,"thread":{"id":9056}}}
{"@timestamp":"2020-10-02T00:23:11.79709900Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"warning","origin":{"file":{"line":1047,"name":"Authenticode.cpp"}}},"message":"Authenticode.cpp:1047 WinVerifyTrust returned: 800b0101, errorExpired (C:\mimikatz_trunk\x64\mimidrv.sys)","process":{"pid":8188,"thread":{"id":4028}}}
{"@timestamp":"2020-10-02T00:23:11.95328800Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":746,"name":"FileScore.cpp"}}},"message":"FileScore.cpp:746 Sending alert for [C:\mimikatz_trunk\x64\mimidrv.sys]","process":{"pid":8188,"thread":{"id":8548}}}
{"@timestamp":"2020-10-02T00:23:48.13958800Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"error","origin":{"file":{"line":629,"name":"SyncKernelMessageManager.cpp"}}},"message":"SyncKernelMessageManager.cpp:629 Process ID 8616: [C:\mimikatz_trunk\Win32\mimikatz.exe] is allowed due to message processing failure, error code -205","process":{"pid":8188,"thread":{"id":9056}}}
{"@timestamp":"2020-10-02T00:23:48.13958800Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"error","origin":{"file":{"line":629,"name":"SyncKernelMessageManager.cpp"}}},"message":"SyncKernelMessageManager.cpp:629 Process ID 8616: [C:\mimikatz_trunk\Win32\mimikatz.exe] is allowed due to message processing failure, error code -205","process":{"pid":8188,"thread":{"id":9056}}}
{"@timestamp":"2020-10-02T00:23:48.73626800Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":746,"name":"FileScore.cpp"}}},"message":"FileScore.cpp:746 Sending alert for [C:\mimikatz_trunk\Win32\mimikatz.exe]","process":{"pid":8188,"thread":{"id":8548}}}
{"@timestamp":"2020-10-02T00:23:48.73626800Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":746,"name":"FileScore.cpp"}}},"message":"FileScore.cpp:746 Sending alert for [C:\mimikatz_trunk\Win32\mimikatz.exe]","process":{"pid":8188,"thread":{"id":8548}}}

A free-text search for mimikatz in Kibana against logs-* returns 0 results, which is already a known issue; I'm not the only one missing the Endpoint malware logs. I also did a joke search for "cute cat" with no results.
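(For reference, the same check can be run straight against Elasticsearch; a minimal sketch, where es_host, the credentials, and the ca.crt path are placeholders for this cluster, assuming curl is available:)

# rough equivalent of the Kibana free-text search across logs-*; es_host, credentials, and ca.crt are placeholders
curl --cacert ca.crt -u elastic:changeme "https://es_host:9200/logs-*/_search?q=mimikatz&size=5&pretty"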

0 alerts are triggered, but that's expected since nothing from Endpoint is being sent. Filebeat and Metricbeat logs are received with 0 issues. What I do find interesting is that the mimikatz process was killed and the exe deleted. Looking over the Defender logs, I did not see that it was the one that stopped it, so maybe 7.9.2 did. The only reference I have in any log is in the lines above. Now to fix endpoint-exceptionlist-windows-v1 and the lack of any malware notices in Elastic.

It looks like Endpoint is in fact the one that prevented Mimikatz! In particular, this is the log line that shows Endpoint prevented Mimikatz from running, since Endpoint wouldn't send an alert unless it had prevented it.

{"@timestamp":"2020-10-02T00:22:41.72155100Z","agent":{"id":"5b1ac9c4-f401-4ad6-9586-6b7c8c124b05","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":746,"name":"FileScore.cpp"}}},"message":"FileScore.cpp:746 Sending alert for [C:\mimikatz_trunk\x64\mimikatz.exe]","process":{"pid":8188,"thread":{"id":9056}}}

Endpoint alerts are written to the index logs-endpoint.alerts-default. "default" is the namespace; if you've changed it in Ingest Manager, 7.9.0, 7.9.1, and 7.9.2 Endpoints won't recognize that change and will still send to the default namespace. A fix is in the works for that bug.
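If it's easier than checking in the UI, a quick search against that index works too. A minimal sketch, with placeholder host and credentials (adjust the index name if you did change the namespace):

# sketch: fetch the most recent document from the Endpoint alerts index; es_host and credentials are placeholders
curl -u elastic:changeme "https://es_host:9200/logs-endpoint.alerts-default/_search?size=1&sort=@timestamp:desc&pretty"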

Do you see the alert in that index? If not, check the Endpoint logs to see if it is able to communicate with Elasticsearch. Details on how to do that are in an earlier comment in this thread.

If you do see the alert, the issue is likely that you need to enable alert detection rules in the Security app. To do so, go to "Security" -> "Detections", then click "Manage detection rules". I recommend clicking "Load prebuilt detection rules and timeline templates", but if you don't, make sure the "Elastic Endpoint Security" rule is enabled. After doing that, try generating an alert again. (If you'd rather do this over the API, see the sketch below.)
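A sketch of the API equivalent of the "load prebuilt rules" button, with placeholder Kibana host and credentials; you'd still need to make sure the "Elastic Endpoint Security" rule itself is enabled afterwards:

# sketch: load the prebuilt detection rules via the Detections API; kibana_host and credentials are placeholders
curl -u elastic:changeme -X PUT "https://kibana_host:5601/api/detection_engine/rules/prepackaged" -H "kbn-xsrf: true"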

Hope this helps!

Sorry, I've been a little delayed on testing; been rather busy. I still haven't had the time to get another fresh machine with Python on it to follow your steps. I will get to it soon, since I want to make sure it's not the API key preventing anything.

I do not have a usable logs-endpoint.alerts-default index at this time; I do see the template for it. To be honest, I don't expect to see it yet, given the other issue of the agent not talking.

I know you awesome devs have an update in the works, mostly around Endpoint not being able to communicate over TLS. All of my clusters run TLS, even the standalone test node, since the SIEM won't start without it.

I have not changed anything with the index. I leave the defaults and use them when needed. You guys have way more time and skill to do this, so I'll defer to your wisdom on that part :slight_smile:

@ferullo Going to mark your response as the solution. I never could get the API key to work, but with the SSL errors being a known issue, it's better to wait than chase my tail anymore.

I repeated the mimikatz test on another machine and it failed: Mimikatz was able to run without being stopped or the file being deleted.

Manually changing the elastic-endpoint config (elastic-endpoint.yaml) and fleet.yml to a hard-coded path to the CA cert resolved the issues.

It's not elegant, scalable, or permanent, since each time an update is pushed it's overwritten, but at least it's repeatable.

I'm glad you found a solution, though I agree it's not a good lasting one. Can you share the config snippet you modified (sanitizing anything that is personal information, of course)?

elastic-endpoint.yaml:

fleet:
  agent:
    id: client_id_goes_here
  api:
    access_api_key: API_Key_goes_here
    kibana:
      host: kibana_server_name_goes_here:5601
      protocol: https
      ssl:
        certificate_authorities:
          - C:\Program Files\Elastic\Endpoint\ca.crt
        renegotiation: never
        verification_mode: full
      timeout: 1m30s

fleet.yml:

ssl:
  verification_mode: full
  certificate_authorities:
    - C:\Program Files\Elastic\Endpoint\ca.crt
  renegotiation: never
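As a quick sanity check that the hard-coded CA path actually verifies the chain, something like this from the endpoint host should succeed without a certificate error. A sketch only, assuming curl.exe is available (it ships with recent Windows 10 builds) and using the same placeholder Kibana hostname as above; an auth error in the response is fine, since only the TLS handshake matters here:

# sketch: verify the CA file validates Kibana's TLS chain from the endpoint host
curl.exe --cacert "C:\Program Files\Elastic\Endpoint\ca.crt" https://kibana_server_name_goes_here:5601/api/status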

After testing on a few machines (up to 10 currently), it's not consistent: some work, some don't. Agent version is still 7.9.2, mind you. It's FAR better than before, as I'm not seeing the disconnect notice or degraded state nearly as often; it's now more consistent with successful messages.

After 4 hours, the disconnect messages start to reappear in the Endpoint logs on the client and the degraded state starts to come back, so I'm afraid the fix was short-lived. This has only been tried on a single cluster, but seeing as the same CA is used for both, it shouldn't make any difference.

It would be awesome if we could use the built-in cert stores on Windows and Linux for the CA lookup, as in larger networks that would scale and allow for cert lookup and invalidation. I do believe this has already been requested and is in the works on GitHub, so I'm just saying it for the people who read the forums.

@ferullo

7.10 = Works. Thank you for your hard work!

The only comments I have are on the Fleet tab (formerly Ingest Manager): add a reminder to set the Elasticsearch host, since it's localhost by default, and in the YAML entry spot add a link to the syntax that shows the supported options for output.

That's great! I'm glad to hear 7.10 was a smoother rollout for you.

I've relayed your comment internally about the localhost configuration workflow.