Elastic Endpoint Security with Elastic Agent

Hey,

I installed the Elastic Agent on a Linux endpoint (Ubuntu) to test how it works and how managing it differs from managing regular Beats. Everything worked great.

Then, I wanted to test the Elastic Endpoint Security and deploy it with the agent. Everything seemed to work fine. The config shows 2 integrations: system and elastic endpoint security.

The problem is that when I go to Security - Administration, I don't see any host running Endpoint Security. Instead, it asks me to enroll agents.

If I go into Fleet and click on a host, I see a flood of logs that look like issues that should already have been resolved in 7.9.1:

Application: endpoint-security--7.9.1[1523b070-f248-4129-a9d9-4404d863b63b]: State changed to RUNNING: 
Application: endpoint-security--7.9.1[1523b070-f248-4129-a9d9-4404d863b63b]: State changed to DEGRADED: Missed last check-in
...

I get that pair of log entries roughly every 20 minutes.

In Datasets, I don't have any *.endpoint or anything similar.

Am I doing something wrong in order to use Elastic Endpoint Security with the Elastic agent?

Thanks in advance for your help!

jsu

ps: I couldn't tag this post with elastic-agent, no idea why.

Hi @jsu

Sorry that you're running into these issues. Could you check the Elastic Endpoint logs and see if anything there indicates a problem? You should be able to find those logs here: /opt/Elastic/Endpoint/state/log/

@jsu You're not going crazy, and you're not alone in this. It is common at the moment: so far Windows, Mac, and Linux (CentOS 8) all do the same thing for me. I get the same results on multiple clusters that are not linked to each other.

It seems to be a random missed check-in. If you check the Endpoint logs, you should see it saying that the Kibana instance you're trying to connect to is offline at that time. It's clearly not offline, as other agents check in at that time. It is far better in 7.9.1 than it was in 7.9.0.

Under Security - Administration I get the same thing; I'm not sure whether this is the intended behavior...

Endpoint is in a rough state in 7.9.x but it's getting better...

The logs are under logs-elastic.agent-* and metrics-elastic.agent.*, with the metrics having one for Endpoint. It makes about zero logical sense when you're looking for it based on the integration name. I looked at it as: Windows would be under Windows and Cisco would be under Cisco, so maybe Endpoint would be under Endpoint. Nope...

Hi @NickFritts, Hi @PublicName,

Thanks for your feedback, I didn't expect the logs to be there..

So here are the 2 lines that are flooding the logs:

{"@timestamp":"2020-09-25T08:42:03.511947228Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":1392,"name":"HttpLib.cpp"}}},"message":"HttpLib.cpp:1392 Establishing GET connection to [https://elastic_server:9200/_cluster/health]","process":{"pid":134998,"thread":{"id":135003}}}
{"@timestamp":"2020-09-25T08:42:03.538649631Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"notice","origin":{"file":{"line":65,"name":"BulkQueueConsumer.cpp"}}},"message":"BulkQueueConsumer.cpp:65 Elasticsearch connection is down","process":{"pid":134998,"thread":{"id":135003}}}

Does this seem to be the same known issue that will be fixed in a later version, or am I doing something wrong here?

@jsu
The same issue is still present in 7.9.2. If you check the timestamps, you should see that the agent you're pulling logs from shows a DEGRADED message in Fleet at the same time. At least it lines up that way for me.

Indeed I just upgraded our stack to 7.9.2 and I still have the DEGRADED messages.

But now I don't even have any data (metrics) coming in anymore, even though I also upgraded the elastic-agent on all boxes. So I guess we'll have to wait until 7.9.3 :)

EDIT: it took a while, but the metrics are back. Data from Endpoint Security still missing.

Hi @jsu. Sorry you're hitting these problems. It seems like you may have two different issues.

The DEGRADED issue
Could you look in Endpoint's logs around the time a DEGRADED message appears in the Agent Activity Log to see if there is any indication of what Endpoint was doing when this happened? Please make sure to adjust for any time zone differences between the timestamps in Endpoint's logs and those in Kibana (if there are any).

In particular, I'm curious about the following:

  1. Is Endpoint actively applying Policy, or did it just apply Policy? (You should see lots of logs with the string "found in config" in them.) The bug that was fixed in 7.9.1 was caused by Endpoint being slow to keep its heartbeat with Agent when applying Policy.
  2. Is Endpoint crashing? Endpoint's PID is in each log message, so if it suddenly changes, that is an indication it may be crashing. If the PID changes, it could also be Elastic Agent restarting or reinstalling Endpoint because it considers Endpoint very unhealthy. Systemd's syslogs would also show if Endpoint is crashing.
  3. Does Endpoint think it has an active connection to Agent when it becomes DEGRADED, or does it know it is disconnected? Check the Endpoint logs (/opt/Elastic/Endpoint/state/log/) to see its status. Below are two commands I used locally.
vagrant@ubuntu:~$ sudo bash -c "grep 'AgentConnectionInfo.cpp' /opt/Elastic/Endpoint/state/log/*"
{"@timestamp":"2020-09-29T20:54:50.968161593Z","agent":{"id":"b5fe037e-4bc6-4644-94cf-3122c490db20","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":110,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:110 Validated agent is root/admin","process":{"pid":5989,"thread":{"id":6000}}}
{"@timestamp":"2020-09-29T20:54:50.973433891Z","agent":{"id":"b5fe037e-4bc6-4644-94cf-3122c490db20","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":118,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:118 Established stage 1 connection to agent","process":{"pid":5989,"thread":{"id":6000}}}
{"@timestamp":"2020-09-29T21:45:14.989756853Z","agent":{"id":"b5fe037e-4bc6-4644-94cf-3122c490db20","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":110,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:110 Validated agent is root/admin","process":{"pid":5989,"thread":{"id":6000}}}
{"@timestamp":"2020-09-29T21:45:14.990314390Z","agent":{"id":"b5fe037e-4bc6-4644-94cf-3122c490db20","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":118,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:118 Established stage 1 connection to agent","process":{"pid":5989,"thread":{"id":6000}}}
vagrant@ubuntu:~$ sudo bash -c "grep 'Agent connection' /opt/Elastic/Endpoint/state/log/*"
{"@timestamp":"2020-09-29T20:55:11.982879236Z","agent":{"id":"b5fe037e-4bc6-4644-94cf-3122c490db20","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":592,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:592 Agent connection established.","process":{"pid":5989,"thread":{"id":6000}}}
{"@timestamp":"2020-09-29T21:45:36.5766533Z","agent":{"id":"b5fe037e-4bc6-4644-94cf-3122c490db20","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":592,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:592 Agent connection established.","process":{"pid":5989,"thread":{"id":6000}}}
vagrant@ubuntu:~$ 
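
Checks 1 and 2 above can also be scripted. The following is a minimal sketch, assuming the JSON-lines log format shown in this thread; the function names endpoint_pids and policy_line_count are made up for illustration and are not part of any Elastic tooling.

```shell
#!/usr/bin/env bash
# Sketch: summarize Endpoint log files for the PID and policy checks.
# Assumes one JSON object per line, as in the log excerpts in this thread.

# Print each distinct Endpoint PID found in the given log files, one per line.
# More than one PID suggests Endpoint crashed or was restarted by Agent.
endpoint_pids() {
    grep -ho '"process":{"pid":[0-9][0-9]*' "$@" \
        | grep -o '[0-9][0-9]*$' \
        | sort -un
}

# Count log lines that mention policy application ("found in config").
policy_line_count() {
    cat "$@" | grep -c 'found in config'
}
```

Both functions take log file paths as arguments, for example the files under /opt/Elastic/Endpoint/state/log/ (reading them needs root, as in the grep commands above). A single PID and a count of 0 would point away from both a crash and a slow policy application.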

The Elasticsearch connection is down issue
Can you follow the steps in the thread Endpoint 7.9 "Degraded and dashboards" to see if you are able to connect to Elasticsearch with Endpoint's config information via curl? There is no need to check the Kibana connection; in 7.9, Linux Endpoints do not have any reason to connect to Kibana.
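
As a rough illustration of that connectivity test (not the exact steps from the linked thread), the sketch below probes _cluster/health with curl and turns the result into a short diagnosis. The function names and messages are hypothetical; curl exit code 60 is the one curl uses when it cannot verify the server certificate against its CA store. Note that curl's trust store is not necessarily the same one Endpoint reads.

```shell
#!/usr/bin/env bash
# Sketch: probe Elasticsearch's _cluster/health and explain common failures.

# Translate a curl exit code and an HTTP status into a short diagnosis.
explain_health_probe() {
    local curl_exit="$1" http_status="$2"
    if [ "$curl_exit" -eq 60 ]; then
        echo "TLS: server certificate not trusted by the CA store curl used"
    elif [ "$curl_exit" -ne 0 ]; then
        echo "connection failed (curl exit code $curl_exit)"
    elif [ "$http_status" = "401" ]; then
        echo "reachable, but credentials are missing or rejected (401)"
    elif [ "$http_status" = "200" ]; then
        echo "reachable and authenticated"
    else
        echo "unexpected HTTP status $http_status"
    fi
}

# Run the probe. The first argument is the base URL; any further arguments
# are passed to curl unchanged (for example: -u <my_user>).
# -k is deliberately not used, so certificate validation is exercised.
probe_health() {
    local url="$1" status rc
    shift
    status=$(curl --silent --output /dev/null --write-out '%{http_code}' \
        "$@" "$url/_cluster/health")
    rc=$?
    explain_health_probe "$rc" "$status"
}
```

A call would look like probe_health https://<elasticsearch_node>:9200 -u <my_user>, with the host and user taken from your own setup.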

Hi @ferullo, thanks a lot for your answer!

So, concerning your second point that I just tested:

  • the original curl command gives me an error because of the self-signed certificate
  • adding -k gives me the error missing authentication credentials for REST request (401)
  • adding -u <my_user> and entering its password gives me the same error (401)

I'll check the first part of your answer and come back as soon as I have something.

So, concerning the DEGRADED issue:

  1. Here are the logs around the timestamp when I had a DEGRADED message:
{"@timestamp":"2020-09-30T09:00:04.44128021Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":1440,"name":"HttpLib.cpp"}}},"message":"HttpLib.cpp:1440 Establishing GET connection to [https://<elasticsearch_node>:9200/_cluster/health]","process":{"pid":244854,"thread":{"id":244891}}}                                                                                                                                                                    
{"@timestamp":"2020-09-30T09:00:04.72186299Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"notice","origin":{"file":{"line":65,"name":"BulkQueueConsumer.cpp"}}},"message":"BulkQueueConsumer.cpp:65 Elasticsearch connection is down","process":{"pid":244854,"thread":{"id":244891}}}                                                                                                                                                                                                  
{"@timestamp":"2020-09-30T09:00:04.123138026Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"warning","origin":{"file":{"line":446,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:446 Failed to read ActionRequest.","process":{"pid":244854,"thread":{"id":263068}}}                                                                                                                                                                                                               
{"@timestamp":"2020-09-30T09:00:04.124573172Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":266,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:266 Agent state stream is closed. Stopping state reading.","process":{"pid":244854,"thread":{"id":263067}}}                                                                                                                                                                                          
{"@timestamp":"2020-09-30T09:00:05.124542339Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":479,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:479 Attempting to reestablish Agent actions stream.","process":{"pid":244854,"thread":{"id":263068}}}                                                                                                                                                                                                
{"@timestamp":"2020-09-30T09:00:05.125059651Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":360,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:360 Attempting to reestablish Agent check-in stream.","process":{"pid":244854,"thread":{"id":263067}}}                                                                                                                                                                                               
{"@timestamp":"2020-09-30T09:00:09.81063867Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":1440,"name":"HttpLib.cpp"}}},"message":"HttpLib.cpp:1440 Establishing GET connection to [https://<elasticsearch_node>:9200/_cluster/health]","process":{"pid":244854,"thread":{"id":244891}}}                                                                                                                                                                    
{"@timestamp":"2020-09-30T09:00:09.109686360Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"notice","origin":{"file":{"line":65,"name":"BulkQueueConsumer.cpp"}}},"message":"BulkQueueConsumer.cpp:65 Elasticsearch connection is down","process":{"pid":244854,"thread":{"id":244891}}}                                                                                                                                                                                                 
{"@timestamp":"2020-09-30T09:00:09.174425088Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":110,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:110 Validated agent is root/admin","process":{"pid":244854,"thread":{"id":244897}}}                                                                                                                                                                                                
{"@timestamp":"2020-09-30T09:00:09.174728928Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":118,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:118 Established stage 1 connection to agent","process":{"pid":244854,"thread":{"id":244897}}}                                                                                                                                                                                      
{"@timestamp":"2020-09-30T09:00:10.188804197Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":565,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:565 Connecting to Agent.","process":{"pid":244854,"thread":{"id":244897}}}                                                                                                                                                                                                                           
{"@timestamp":"2020-09-30T09:00:14.119281356Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":1440,"name":"HttpLib.cpp"}}},"message":"HttpLib.cpp:1440 Establishing GET connection to [https://<elasticsearch_node>:9200/_cluster/health]","process":{"pid":244854,"thread":{"id":244891}}}                                                                                                                                                                   
{"@timestamp":"2020-09-30T09:00:14.153836592Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"notice","origin":{"file":{"line":65,"name":"BulkQueueConsumer.cpp"}}},"message":"BulkQueueConsumer.cpp:65 Elasticsearch connection is down","process":{"pid":244854,"thread":{"id":244891}}}                                                                                                                                                                   
{"@timestamp":"2020-09-30T09:00:29.268475875Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"notice","origin":{"file":{"line":65,"name":"BulkQueueConsumer.cpp"}}},"message":"BulkQueueConsumer.cpp:65 Elasticsearch connection is down","process":{"pid":244854,"thread":{"id":244891}}}                                                                                                                                                                                                 
{"@timestamp":"2020-09-30T09:00:31.191574993Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":592,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:592 Agent connection established.","process":{"pid":244854,"thread":{"id":244897}}}                                                                                                                                                                                                                  
{"@timestamp":"2020-09-30T09:00:34.274866210Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":1440,"name":"HttpLib.cpp"}}},"message":"HttpLib.cpp:1440 Establishing GET connection to [https://<elasticsearch_node>:9200/_cluster/health]","process":{"pid":244854,"thread":{"id":244891}}}                                                                                                                                                                   
{"@timestamp":"2020-09-30T09:00:34.301478832Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"notice","origin":{"file":{"line":65,"name":"BulkQueueConsumer.cpp"}}},"message":"BulkQueueConsumer.cpp:65 Elasticsearch connection is down","process":{"pid":244854,"thread":{"id":244891}}}

I'm showing only the first and last few lines to illustrate how these messages flood the logs.

So, no "found in config" in here.
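
To quantify the flooding instead of eyeballing it, something like the following sketch could be used. It assumes the same JSON-lines format as the excerpt above; message_histogram is a made-up helper name.

```shell
#!/usr/bin/env bash
# Sketch: count how often each distinct "message" value appears in the given
# Endpoint log files, most frequent first.
message_histogram() {
    grep -ho '"message":"[^"]*"' "$@" | sort | uniq -c | sort -rn
}
```

Called with the Endpoint log files as arguments, it prints one line per distinct message, prefixed by its count, so a repeating "Elasticsearch connection is down" entry shows up at the top.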

edit: the rest of the post is below because of the character limit

  2. The PID stays the same, so it doesn't seem to be crashing.

  3. Here are the results of the 2 commands:

  • sudo bash -c "grep 'AgentConnectionInfo.cpp' /opt/Elastic/Endpoint/state/log/*"
/opt/Elastic/Endpoint/state/log/endpoint-000003.log:{"@timestamp":"2020-09-29T20:27:23.527144233Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":118,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:118 Established stage 1 connection to agent","process":{"pid":244854,"thread":{"id":244897}}}                                                                                                                                   
/opt/Elastic/Endpoint/state/log/endpoint-000003.log:{"@timestamp":"2020-09-29T20:42:46.553220799Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":110,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:110 Validated agent is root/admin","process":{"pid":244854,"thread":{"id":244897}}}                                                                                                                                             
/opt/Elastic/Endpoint/state/log/endpoint-000003.log:{"@timestamp":"2020-09-29T20:42:46.553474893Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":118,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:118 Established stage 1 connection to agent","process":{"pid":244854,"thread":{"id":244897}}}                                                                                                                                   
/opt/Elastic/Endpoint/state/log/endpoint-000003.log:{"@timestamp":"2020-09-29T21:03:09.582278239Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":110,"name":"AgentConnectionInfo.cpp"}}},"message":"AgentConnectionInfo.cpp:110 Validated agent is root/admin","process":{"pid":244854,"thread":{"id":244897}}} 
  • sudo bash -c "grep 'Agent connection' /opt/Elastic/Endpoint/state/log/*"
/opt/Elastic/Endpoint/state/log/endpoint-000003.log:{"@timestamp":"2020-09-30T00:37:23.561971191Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":592,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:592 Agent connection established.","process":{"pid":244854,"thread":{"id":244897}}}                                                                                                                                                              
/opt/Elastic/Endpoint/state/log/endpoint-000003.log:{"@timestamp":"2020-09-30T00:57:49.584601956Z","agent":{"id":"31e817fc-6fcf-4bd4-8d29-e3e5206e2c46","type":"endpoint"},"ecs":{"version":"1.5.0"},"log":{"level":"info","origin":{"file":{"line":592,"name":"AgentComms.cpp"}}},"message":"AgentComms.cpp:592 Agent connection established.","process":{"pid":244854,"thread":{"id":244897}}}

So if I'm understanding correctly, this is a security issue, right?

If so, what I have a hard time understanding is why it can't connect while the agent's other integrations can.

Thanks already for all the help!

Thank you for the logs on the DEGRADED issue. The Failed to read ActionRequest line is interesting to us; we're diving into the Endpoint code to see if we can figure out what could cause it and what next steps to take to debug this for you. As I understand it, this issue resolves itself and Endpoint goes back into a RUNNING state, correct?

For the Elasticsearch connection issue, it looks like you need to add your self-signed certificate's root CA to the local store on your hosts. Endpoint uses the system's root certificate authorities to validate the TLS connection with Elasticsearch. This issue tracks a feature being added to Kibana so local CA stores don't need to be updated.

Thanks for the feedback.

"As I understand it, this issue resolves itself and Endpoint goes back into a RUNNING state, correct?"

This seems correct indeed.

"For the Elasticsearch connection issue, it looks like you need to add your self-signed certificate's root CA to the local store on your hosts. Endpoint uses the system's root certificate authorities to validate the TLS connection with Elasticsearch. This issue tracks a feature being added to Kibana so local CA stores don't need to be updated."

Well, from what I understand, I already did that in order to make the elastic-agent work - at least the metrics. Hence why I'm wondering why some parts of the agent work while others (endpoint) don't.

Ah, thanks for explaining. This seems like a bug in Endpoint. It should be checking the location you added the certificate to for Agent. Can you share the instructions you used to get Agent working and the location you installed the certificate to, so we can make sure this works for Endpoint too in a future release?

In the meantime, this thread has instructions to help you get Endpoint connected. In particular, this comment describes how to add the certificate to a place Endpoint will currently look.

Thanks for your answer. So I followed the documentation to install the agent and then I deployed Endpoint with Fleet.

At first, I had the same issue as here, but with all integrations of the agent: my ca.pem was in a custom directory under /etc/ and, while that was fine for the old Beats, it wasn't working with the agent. I moved the file to /etc/ssl/certs/ and restarted the elastic-agent service, and it worked fine.

So, now concerning Endpoint, thanks to your link I managed to make it work, but not exactly in the same way (fyi, I'm talking about Ubuntu 20.04):

  • I converted the .pem CA to .crt and copied it to /usr/share/ca-certificates
  • update-ca-certificates wasn't enough, I needed to run sudo dpkg-reconfigure ca-certificates
  • restarting the service wasn't enough, a reboot was needed

And now it's working.
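
For anyone retracing these steps, here is a hedged sketch of them as a script. It assumes the CA file is already PEM-encoded, in which case the "conversion" is just a copy under a .crt name (update-ca-certificates only picks up files with that extension). The helper names and the default file name my-elastic-ca.crt are placeholders, not something from the original setup. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the Ubuntu 20.04 steps listed above.
# Prints the commands by default; set DRY_RUN=0 to execute them (needs root).

# Echo the command in dry-run mode, otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: install_ca_for_endpoint <path-to-ca.pem> [target-name.crt]
install_ca_for_endpoint() {
    local pem="$1" name="${2:-my-elastic-ca.crt}"
    # Assumption: $pem is PEM-encoded, so copying it with a .crt name suffices.
    run cp "$pem" "/usr/share/ca-certificates/$name"
    # Interactive: select the new certificate in the dialog that opens.
    run dpkg-reconfigure ca-certificates
    # In the report above a service restart was not enough; a reboot was.
    run reboot
}
```

Running install_ca_for_endpoint /path/to/ca.pem in dry-run mode lists the three commands so they can be reviewed before anything is changed on the host.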

If I understand correctly, it's not the agent that handles the certificates, but every integration itself, right?

Thanks anyway for your help, really nice, and I'm looking forward to seeing the evolution of the agent and of Endpoint in particular :)

Cheers!

Thanks for the reply. I'm glad you got it all working!

Correct. Endpoint runs as its own executable with its own service. Agent manages installing/uninstalling/configuring Endpoint but under the hood they are different binaries. Since each integration communicates directly with Elasticsearch, each needs to handle certificates independently.

Ok great, thanks for the explanation!