Endpoint 7.9 "Degraded and dashboards"

ferullo · September 29, 2020, 9:56pm

Sorry it's been a few days without a reply.

Given that it seems like you're having connectivity issues, let's work through your networking. There are two connections we need to validate for Endpoint. For both, API key authentication takes precedence over username/password authentication if both are in Endpoint's config.

Given that you hit errors in the past it's best to start with a fresh Agent and Endpoint install if possible so there is less in the Endpoint logs to go through. It would be helpful to know which connection is not working and what errors you're seeing.

Note that in the example commands below some specifics, like the API keys and URLs, will of course be different for you. Also not all of the commands below are a part of a native Windows installation. Hopefully if the exact commands don't work for you you'll be able to figure out some variant of them that works on your computer; if not just ping back and we'll find a different command together.

Connection to Kibana
Endpoint connects to Kibana to download potentially large artifacts it needs to fully apply the policy. For example, for 7.9 this is how Endpoint downloads the Alert Exceptions to apply on macOS and Windows.

In Endpoint's config (c:\Program Files\Elastic\Endpoint\elastic-endpoint.yaml) you should see a snippet that looks like this:

fleet:
  api:
    access_api_key: BASE64VALUE
    kibana:
      host: example.com
      protocol: https
inputs:
  - artifact_manifest:
      artifacts:
        endpoint-exceptionlist-windows-v1:
          relative_url: /api/endpoint/artifacts/download/endpoint-exceptionlist-windows-v1/d801aa1fb7ddcc330a5e3173372ea6af4a3d08ec58074478e85aa5603e926658

Based on that you can search Endpoint's logs for the relative_url to see what happened when Endpoint tried to download the artifact. On my machine these are the logs I see.

C:\WINDOWS\system32>grep endpoint-exceptionlist-windows-v1 "c:\Program Files\Elastic\Endpoint\state\log\endpoint-000000.log"
{"@timestamp":"2020-09-29T21:26:10.70243100Z","agent":{"id":"4b707d92-f692-4d70-9251-fa99fa06435c","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":2241,"name":"Artifacts.cpp"}}},"message":"Artifacts.cpp:2241 Downloading artifact: endpoint-exceptionlist-windows-v1","process":{"pid":9832,"thread":{"id":2232}}}
{"@timestamp":"2020-09-29T21:26:10.70243100Z","agent":{"id":"4b707d92-f692-4d70-9251-fa99fa06435c","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":1440,"name":"HttpLib.cpp"}}},"message":"HttpLib.cpp:1440 Establishing GET connection to [https://example.com:443/api/endpoint/artifacts/download/endpoint-exceptionlist-windows-v1/d801aa1fb7ddcc330a5e3173372ea6af4a3d08ec58074478e85aa5603e926658]","process":{"pid":9832,"thread":{"id":2232}}}
{"@timestamp":"2020-09-29T21:26:10.32287000Z","agent":{"id":"4b707d92-f692-4d70-9251-fa99fa06435c","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":497,"name":"Artifacts.cpp"}}},"message":"Artifacts.cpp:497 Artifact endpoint-exceptionlist-windows-v1 successfully verified","process":{"pid":9832,"thread":{"id":2232}}}

Further, you can use Curl to manually try to download the same artifact. Make sure to pipe the value to something like xxd since the content downloaded isn't text.

C:\WINDOWS\system32>curl -H "Authorization: ApiKey BASE64VALUE" https://example.com:443/api/endpoint/artifacts/download/endpoint-exceptionlist-windows-v1/d801aa1fb7ddcc330a5e3173372ea6af4a3d08ec58074478e85aa5603e926658 | xxd
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    22  100    22    0     0     22      0  0:00:01 --:--:--  0:00:01    41
00000000: 789c ab56 4acd 2b29 ca4c 2d56 b28a 8ead  x..VJ.+).L-V....
00000010: 0500 2719 0529                           ..'..)

C:\WINDOWS\system32>

Connection to Elasticsearch
Endpoint connects to Elasticsearch to store data that it generates.

In Endpoint's config file you should see a snippet that looks like this:

output:
  elasticsearch:
    api_key: raw:value
    hosts:
      - https://example.com:443

Based on that you can search Endpoint's logs to see what happens when it checks to see if it can send to Elasticsearch. If after checking the cluster health it sends data to the _bulk API then it is able to send data.

C:\WINDOWS\system32>grep -A 1 "_cluster/health" "c:\Program Files\Elastic\Endpoint\state\log\endpoint-000000.log"
{"@timestamp":"2020-09-29T21:26:16.38473600Z","agent":{"id":"4b707d92-f692-4d70-9251-fa99fa06435c","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":1440,"name":"HttpLib.cpp"}}},"message":"HttpLib.cpp:1440 Establishing GET connection to [https://example.com:443/_cluster/health]","process":{"pid":9832,"thread":{"id":9352}}}
{"@timestamp":"2020-09-29T21:26:16.45341500Z","agent":{"id":"4b707d92-f692-4d70-9251-fa99fa06435c","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":1440,"name":"HttpLib.cpp"}}},"message":"HttpLib.cpp:1440 Establishing POST connection to [https://example.com:443/_bulk]","process":{"pid":9832,"thread":{"id":9352}}}

C:\WINDOWS\system32>

You can also search for "documents to Elasticsearch" to see how many documents Endpoint is periodically sending.

C:\WINDOWS\system32>grep "documents to Elasticsearch" "c:\Program Files\Elastic\Endpoint\state\log\endpoint-000000.log" | head -n 4
{"@timestamp":"2020-09-29T21:26:17.55295400Z","agent":{"id":"4b707d92-f692-4d70-9251-fa99fa06435c","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":180,"name":"BulkQueueConsumer.cpp"}}},"message":"BulkQueueConsumer.cpp:180 Sent 8 documents to Elasticsearch","process":{"pid":9832,"thread":{"id":9352}}}
{"@timestamp":"2020-09-29T21:28:11.49117600Z","agent":{"id":"4b707d92-f692-4d70-9251-fa99fa06435c","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":180,"name":"BulkQueueConsumer.cpp"}}},"message":"BulkQueueConsumer.cpp:180 Sent 1 documents to Elasticsearch","process":{"pid":9832,"thread":{"id":9352}}}
{"@timestamp":"2020-09-29T21:28:13.63456900Z","agent":{"id":"4b707d92-f692-4d70-9251-fa99fa06435c","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":180,"name":"BulkQueueConsumer.cpp"}}},"message":"BulkQueueConsumer.cpp:180 Sent 227 documents to Elasticsearch","process":{"pid":9832,"thread":{"id":9352}}}
{"@timestamp":"2020-09-29T21:30:11.2557800Z","agent":{"id":"4b707d92-f692-4d70-9251-fa99fa06435c","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":180,"name":"BulkQueueConsumer.cpp"}}},"message":"BulkQueueConsumer.cpp:180 Sent 1 documents to Elasticsearch","process":{"pid":9832,"thread":{"id":9352}}}

C:\WINDOWS\system32>

From the configuration file snippet you can also generate a Curl request to see what happens when you manually try to connect to Elasticsearch. Notice that before using Curl you must base 64 encode the api_key value.

C:\WINDOWS\system32>python3 -c "import base64; print(base64.b64encode('raw:value'.encode('utf-8')))"
b'cmF3OnZhbHVl'

C:\WINDOWS\system32>curl -H "Authorization: ApiKey cmF3OnZhbHVl" https://example.com:443/_cluster/health
{"cluster_name":"0e0111df93d141b8997a992f385d2aa8","status":"green","timed_out":false,"number_of_nodes":3,"number_of_data_nodes":2,"active_primary_shards":42,"active_shards":84,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}
C:\WINDOWS\system32>

Topic		Replies	Views
Endpoint-security State changed to DEGRADED: Protecting with policy Endpoint Security	5	3046	September 14, 2021
Elastic Endpoint Security with Elastic Agent Endpoint Security	16	3054	November 10, 2020
Issue with Endpoint agent Elastic Security elastic-stack-monitoring , elastic-stack-security	7	1500	November 4, 2022
[again] Endpoint security immediately degraded Endpoint Security	9	2172	December 27, 2022
Endpoint Security DEGRADED, Malware failed to enable due to potential system deadlock Endpoint Security	8	1774	September 14, 2021

Endpoint 7.9 "Degraded and dashboards"

Related topics