Elastic Agent shows healthy in Kibana (and no error messages in the logs) but fails to send data to Elasticsearch

v 7.17.2 - ELK

Elastic Agent for Endpoint Security gets enrolled and installed successfully.

But instead of showing data received from the Elastic Agent, Kibana always returns to the "Next step: Enroll an Agent with Endpoint Security" screen (see the attached screenshot), and the Kibana -> Fleet screen ( http://x.y.z:5601/app/fleet/data-streams ) shows "no data streams".

Please help me resolve this one remaining issue in an otherwise smooth Elastic installation.

ubuntu@siezefire:~$ sudo elastic-agent enroll -f --url=https://192.168.0.110:8220 --fleet-server-es=https://192.168.0.110:9200 --fleet-server-service-token=abc --fleet-server-policy=efg --certificate-authorities=/etc/elasticsearch/ca/ca.crt --fleet-server-es-ca=/etc/elasticsearch/ca/ca.crt --fleet-server-cert=/etc/elasticsearch/R.crt --fleet-server-cert-key=/etc/elasticsearch/K.key

2022-04-16T19:45:51.865Z INFO cmd/enroll_cmd.go:776 Fleet Server - Stopping

2022-04-16T19:45:53.873Z INFO cmd/enroll_cmd.go:776 Fleet Server - Stopped

2022-04-16T19:45:57.877Z INFO cmd/enroll_cmd.go:776 Fleet Server - Starting

2022-04-16T19:46:05.881Z INFO cmd/enroll_cmd.go:757 Fleet Server - Running on policy with Fleet Server integration: abzzzzz; missing config fleet.agent.id (expected during bootstrap process)

2022-04-16T19:46:06.548Z INFO cmd/enroll_cmd.go:454 Starting enrollment to URL: [https://192.168](https://192.0.0.1.xx:8220/

2022-04-16T19:46:07.458Z INFO cmd/enroll_cmd.go:254 Successfully triggered restart on running Elastic Agent.

Successfully enrolled the Elastic Agent.

ubuntu@hostz:~$ sudo elastic-agent status
Status: HEALTHY
Message: (no message)
Applications:
  * filebeat_monitoring    (HEALTHY)
                           Running
  * metricbeat_monitoring  (HEALTHY)
                           Running
  * endpoint-security      (HEALTHY)
                           Protecting with policy {abczzzz}
  * fleet-server           (HEALTHY)
                           Running on policy with Fleet Server integration: xyzsssssss

PS: The Auditbeat agent successfully sends data to Kibana and Elasticsearch.

Hi @ajj31

Thanks for checking out Endpoint Security. Sorry that it's been a few days before you got a reply.

This situation implies that Endpoint is not able to write to Elasticsearch.

It's confusing that Endpoint is reporting its status as HEALTHY since things are not working as expected. We'll have to improve that. However, since Endpoint's status is HEALTHY it does mean that Endpoint is successfully doing what is in the policy configuration. In other words, even though data isn't available in Elasticsearch, Endpoint is protecting the computer and once this networking issue is fixed Endpoint's cache of unwritten documents will be flushed to Elasticsearch.

The first step to triage this issue is to look in Endpoint's logs. I recommend searching for the log message "Elasticsearch connection is down" then looking at messages just before to see why the connection is not working. Most likely it is a configuration issue (e.g. CA certificates, etc).

Logs on disk are located in c:\Program Files\Elastic\Endpoint\state\log (Windows), /opt/Elastic/Endpoint/state/log (Linux), and /Library/Elastic/Endpoint/state/log/ (macOS). Since it looks like Agent and Filebeat are in good working order, with default settings Endpoint's logs will also appear in the Agent page in Fleet and in the Observability app in Kibana. If you change the Agent log level in Fleet (e.g. to debug), Endpoint will also start logging at that level.
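The log search described above can be sketched in shell. This is only a minimal sketch, assuming a Linux host with GNU grep and the default log directory listed above; the function name `endpoint_conn_errors` and its two arguments are made up for illustration and are not part of Endpoint.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Endpoint): print every occurrence of
# the "Elasticsearch connection is down" message plus the lines just before it.
#   $1  log directory (default: the Linux path mentioned above)
#   $2  number of preceding lines to show per match (default: 20)
endpoint_conn_errors() {
  local log_dir="${1:-/opt/Elastic/Endpoint/state/log}"
  local context="${2:-20}"
  # -r: search all files under the directory
  # -B: include N lines of leading context before each match
  grep -r -B "$context" -- "Elasticsearch connection is down" "$log_dir"
}
```

Reading these logs usually requires root, so the function would be sourced and called from a root shell, for example `endpoint_conn_errors /opt/Elastic/Endpoint/state/log 50`. A non-zero exit status with no output means the message was not found in that directory.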

Endpoint makes a few network connections. One is to Fleet Server (which appears to be working for you), one to Elasticsearch (which is not working), and one to receive protection updates from Elastic (from artifacts.security.elastic.co, which may or may not be working for you, but either way it is not required and is unrelated to your issue). So when looking at log messages around a failure, make sure they are for the Elasticsearch connection.

@ajj31 in addition to @ferullo's suggestion to look at the Endpoint logs, you can also quickly determine whether any Endpoint-related data is shipping to ES by looking at the Data Streams management section.

"no data streams" available in Kibana

Based on the above comment, you may have already checked this, but just to be thorough, go to "Stack Management -> Index Management -> Data Streams" and filter the list by "endpoint", as in the screenshot below. You should see several data streams with logs-endpoint* and metrics-endpoint* prefixes. If you do not see any relevant data streams, then the Endpoint logs may have more info, as @ferullo suggests.
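The same check can also be done from a terminal against Elasticsearch's `_data_stream` API. The sketch below is an assumption-laden illustration rather than an official tool: `endpoint_data_streams` is a hypothetical helper that only filters the JSON response, and the host, user, and CA path in the usage comment are placeholders to be replaced with your own values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a "GET _data_stream" JSON response on stdin and
# print the names of data streams starting with logs-endpoint or
# metrics-endpoint, one per line. Prints nothing if there are none.
endpoint_data_streams() {
  # Keep only "name":"..." pairs with the prefixes of interest, then take
  # the value, which is the 4th field when splitting on double quotes.
  grep -oE '"name" *: *"(logs|metrics)-endpoint[^"]*"' | cut -d '"' -f 4
}

# Example usage (placeholders: <es-host>, the user, and the CA path):
#   curl --silent --cacert /etc/elasticsearch/ca/ca.crt -u elastic \
#     'https://<es-host>:9200/_data_stream' | endpoint_data_streams
```

Empty output corresponds to the "no data streams" situation described in the question.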

I should have mentioned before, another option is to run the command elastic-endpoint.exe test output (Endpoint's executable is in the root of Endpoint's install directory where logs are found). That command should give a summary of why Endpoint cannot write to Elasticsearch. Make sure to run the command as Administrator/root.
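On Linux the executable has no .exe suffix. A minimal sketch of running the command there follows; the default path `/opt/Elastic/Endpoint/elastic-endpoint` is an assumption inferred from the Linux log directory given earlier, and the wrapper function `endpoint_test_output` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run Endpoint's "test output" command.
#   $1  path to the Endpoint executable (default is an assumed Linux
#       location, inferred from the log directory mentioned earlier)
endpoint_test_output() {
  local bin="${1:-/opt/Elastic/Endpoint/elastic-endpoint}"
  if [ ! -x "$bin" ]; then
    echo "Endpoint executable not found or not executable: $bin" >&2
    return 1
  fi
  "$bin" test output
}
```

Because the command needs to run as root, the function would be sourced and called from a root shell; alternatively, run the executable directly with `sudo` and the same `test output` arguments.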

Hello @ferullo

The error in the elastic-agent logs said "connection to 9200 was refused", even though Kibana was successfully connecting to Elasticsearch from the same system on the same port.

Error

default/fleet-server-json.log-2022-05-01-20-16:{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","new":{"Host":"0.0.0.0","Port":8220,"InternalPort":8221,"TLS":{"Enabled":null,"VerificationMode":"full","Versions":null,"CipherSuites":null,"CAs":null,"Certificate":{"Certificate":"/etc/elasticsearch/abc.crt","Key":"[redacted]","Passphrase":""},"CurveTypes":null,"ClientAuth":0},"Timeouts":{"Read":60000000000,"Write":600000000000,"Idle":30000000000,"ReadHeader":5000000000,"CheckinTimestamp":30000000000,"CheckinLongPoll":300000000000,"CheckinJitter":30000000000},"Profiler":{"Enabled":false,"Bind":"localhost:6060"},"CompressionLevel":1,"CompressionThresh":1024,"Limits":{"PolicyThrottle":5000000,"MaxHeaderByteSize":8192,"MaxConnections":20000,"CheckinLimit":{"Interval":1000000,"Burst":2000,"Max":10001,"MaxBody":1048576},"ArtifactLimit":{"Interval":1000000,"Burst":2000,"Max":4000,"MaxBody":0},"EnrollLimit":{"Interval":10000000,"Burst":100,"Max":200,"MaxBody":524288},"AckLimit":{"Interval":1000000,"Burst":2000,"Max":4000,"MaxBody":2097152}},"Runtime":{"GCPercent":0},"Bulk":{"FlushInterval":250000000,"FlushThresholdCount":2048,"FlushThresholdSize":1048576,"FlushMaxPending":8}},"@timestamp":"2022-05-01T20:43:32.66Z","message":"initial server configuration"}
default/fleet-server-json.log-2022-05-01-20-16:{"log.level":"error","service.name":"fleet-server","service.name":"fleet-server","cluster.addr":["siezefire.siezeconsulting.com:9200"],"cluster.user":"","cluster.maxConnsPersHost":128,"error.message":"dial tcp 192.168.0.x:9200: connect: connection refused","@timestamp":"2022-05-01T20:43:32.667Z","message":"fail elasticsearch info"}
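The "connection refused" in the last log line can be reproduced outside of Fleet Server with a plain TCP check. This is a rough sketch, assuming bash (for its `/dev/tcp` redirection) and the `timeout` utility from GNU coreutils; the function name `can_connect` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if a TCP connection to host:port can be
# opened within the given time. No TLS or HTTP is involved, so certificates
# are not tested here, only whether something accepts the connection.
#   $1 host   $2 port   $3 timeout in seconds (default: 3)
can_connect() {
  local host="$1" port="$2" wait="${3:-3}"
  # Inside the child bash, $0 is the host and $1 is the port.
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example usage, run on the Fleet Server host:
#   can_connect siezefire.siezeconsulting.com 9200 && echo open || echo "refused or timed out"
```

A refusal generally means nothing accepted the connection at that exact address and port, so it may be worth testing both the address shown in the log and whatever address Kibana is configured to use.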

The enrolment and subsequent installation of elastic-agent did not report any error related to the certificate or to Elasticsearch, as the successful enrolment logs below show.

root@siezefire:/etc/elasticsearch# sudo elastic-agent enroll --url=https://siezefire.siezeconsulting.com:8220   --fleet-server-es=https://siezefire.siezeconsulting.com:9200   --fleet-server-service-token=zuzuz   --fleet-server-policy=xxxxxxx   --certificate-authorities=/etc/elasticsearch/ca/ca.crt   --fleet-server-es-ca=/etc/elasticsearch/ca/ca.crt   --fleet-server-cert=/etc/elasticsearch/abc-fleet-server/abc-fleet-server.crt   --fleet-server-cert-key=/etc/elasticsearch/abc-fleet-server/abc-fleet-server.key

2022-05-01T15:59:50.024Z INFO cmd/enroll_cmd.go:571 Spawning Elastic Agent daemon as a subprocess to complete bootstrap process.

2022-05-01T15:59:50.273Z INFO application/application.go:67 Detecting execution mode

2022-05-01T15:59:50.278Z INFO application/application.go:88 Agent is in Fleet Server bootstrap mode

2022-05-01T15:59:51.033Z INFO cmd/enroll_cmd.go:743 Waiting for Elastic Agent to start Fleet Server

2022-05-01T15:59:51.089Z INFO [api] api/server.go:62 Starting stats endpoint

2022-05-01T15:59:51.089Z INFO application/fleet_server_bootstrap.go:130 Agent is starting

2022-05-01T15:59:51.089Z INFO [api] api/server.go:64 Metrics endpoint listening on: /var/lib/elastic-agent/data/tmp/elastic-agent.sock (configured: unix:///var/lib/elastic-agent/data/tmp/elastic-agent.sock)

2022-05-01T15:59:51.092Z INFO application/fleet_server_bootstrap.go:140 Agent is stopped

2022-05-01T15:59:51.098Z INFO stateresolver/stateresolver.go:48 New State ID is C8D3jRMm

2022-05-01T15:59:51.098Z INFO stateresolver/stateresolver.go:49 Converging state requires execution of 1 step(s)

2022-05-01T15:59:54.250Z INFO log/reporter.go:40 2022-05-01T15:59:54Z - message: Application: fleet-server--7.17.3[]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'

2022-05-01T15:59:54.253Z INFO stateresolver/stateresolver.go:66 Updating internal state

2022-05-01T15:59:55.834Z WARN status/reporter.go:236 Elastic Agent status changed to: 'degraded'

2022-05-01T15:59:55.834Z INFO log/reporter.go:40 2022-05-01T15:59:55Z - message: Application: fleet-server--7.17.3[]: State changed to DEGRADED: Running on policy with Fleet Server integration: 499b5aa7-d214-5b5d-838b-3cd76469844e; missing config fleet.agent.id (expected during bootstrap process) - type: 'STATE' - sub_type: 'RUNNING'

2022-05-01T15:59:57.040Z INFO cmd/enroll_cmd.go:757 Fleet Server - Running on policy with Fleet Server integration: 499b5aa7-d214-5b5d-838b-3cd76469844e; missing config fleet.agent.id (expected during bootstrap process)

2022-05-01T15:59:57.080Z INFO cmd/enroll_cmd.go:454 Starting enrollment to URL: https://siezefire.siezeconsulting.com:8220/

2022-05-01T15:59:58.421Z INFO cmd/enroll_cmd.go:258 Elastic Agent has been enrolled; start Elastic Agent

Successfully enrolled the Elastic Agent.

2022-05-01T15:59:58.421Z INFO cmd/run.go:184 Shutting down Elastic Agent and sending last events...

2022-05-01T15:59:58.422Z INFO operation/operator.go:216 waiting for installer of pipeline 'default' to finish

2022-05-01T15:59:58.422Z INFO process/app.go:176 Signaling application to stop because of shutdown: fleet-server--7.17.3

2022-05-01T15:59:59.923Z INFO status/reporter.go:236 Elastic Agent status changed to: 'online'

2022-05-01T15:59:59.923Z INFO log/reporter.go:40 2022-05-01T15:59:59Z - message: Application: fleet-server--7.17.3[]: State changed to STOPPED: Stopped - type: 'STATE' - sub_type: 'STOPPED'

2022-05-01T15:59:59.924Z INFO cmd/run.go:192 Shutting down completed.

2022-05-01T15:59:59.924Z INFO [api] api/server.go:66 Stats endpoint (/var/lib/elastic-agent/data/tmp/elastic-agent.sock) finished: accept unix /var/lib/elastic-agent/data/tmp/elastic-agent.sock: use of closed network connection

root@siezefire:/etc/elasticsearch#

The fleet.yml was automatically generated as part of the enrolment process. Should I be manually changing anything here so that the elastic-agent can connect to Elasticsearch with a certificate (currently not seen in fleet.yml)?

fleet.yml

root@siezefire:/etc/elastic-agent# cat fleet.yml
agent:
  id: abc
  monitoring.http:
    enabled: false
    host: ""
    port: 6791
fleet:
  enabled: true
  access_api_key: xyxxxx
  protocol: https
  host: abc.xyz:8220
  ssl:
    verification_mode: ""
    certificate_authorities:
    - /etc/Elasticsearch/ca/ca.crt
    renegotiation: never
  timeout: 10m0s
  proxy_disable: true
  reporting:
    threshold: 10000
    check_frequency_sec: 30
  agent:
    id: ""
  server:
    policy:
      id: yyyyyyyyyy
    output:
      Elasticsearch:
        protocol: https
        hosts:
        - abc.xyz.com:9200
        service_token: xxxxxxx
        ssl:
          verification_mode: ""
          certificate_authorities:
          - /etc/Elasticsearch/ca/ca.crt
          renegotiation: never
        proxy_disable: false
        proxy_headers: {}
    host: 0.0.0.0
    port: 8220
    internal_port: 8221
    ssl:
      verification_mode: ""
      certificate: /etc/Elasticsearch/siezeconsulting-fleet-server/siezeconsulting-fleet-server.crt
      key: /etc/Elasticsearch/siezeconsulting-fleet-server/siezeconsulting-fleet-server.key
      renegotiation: never
root@siezefire:/etc/elastic-agent#


The elastic-agent.yml contains nothing significant, given that most of the enrolment configuration was passed on the command line above.

elastic-agent.yml
-----------------------
root@siezefire:/etc/elastic-agent# cat elastic-agent.yml
# ================================ General =====================================
# Beats is configured under Fleet, you can define most settings
# from the Kibana UI. You can update this file to configure the settings that
# are not supported by Fleet.
fleet:
  enabled: true

# agent.download:
#   # source of the artifacts, requires elastic like structure and naming of the binaries
#   # e.g /windows-x86.zip
#   sourceURI: "https://artifacts.elastic.co/downloads/beats/"
#   # path to the directory containing downloaded packages
#   target_directory: "${path.data}/downloads"
#   # timeout for downloading package
#   timeout: 120s
#   # file path to a public key used for verifying downloaded artifacts
#   # if not file is present Elastic Agent will try to load public key from elastic.co website.
#   pgpfile: "${path.data}/elastic.pgp"
#   # install_path describes the location of installed packages/programs. It is also used
#   # for reading program specifications.
#   install_path: "${path.data}/install"

# agent.process:
#   # minimal port number for spawned processes
#   min_port: 10000
#   # maximum port number for spawned processes
#   max_port: 30000
#   # timeout for creating new processes. when process is not successfully created by this timeout
#   # start operation is considered a failure
#   spawn_timeout: 30s

# agent.retry:
#   # enabled determines whether retry is possible. Default is false.
#   enabled: true
#   # retries_count specifies number of retries. Default is 3.
#   # Retry count of 1 means it will be retried one time after one failure.
#   retries_count: 3
#   # delay specifies delay in ms between retries. Default is 30s
#   delay: 30s
#   # max_delay specifies maximum delay in ms between retries. Default is 300s
#   max_delay: 5m
#   # Exponential determines whether delay is treated as exponential.
#   # With 30s delay and 3 retries: 30, 60, 120s
#   # Default is false
#   exponential: false

Let me know if I can upload the output of "elastic-agent diagnostics collect" to a location you suggest.




Also, I have one other related observation/question. The Fleet Server with the "Default fleet server" policy gets enrolled successfully.

But when I try to enrol another Endpoint Security agent using "Default policy", the elastic-agent shows the following prompt:

This will replace your current settings. Do you want to continue? [Y/n]:n

Can't I enrol multiple agents after fleet-server is enrolled and installed?