Facing TLS handshake error in Fleet server elastic agent logs

Hi all,

I have configured a Fleet Server on an EC2 instance. The setup is otherwise working fine, but I am currently seeing a TLS handshake error in the Fleet Server Elastic Agent logs.

{"log.level":"error","service.name":"fleet-server","service.name":"fleet-server","message":"http: TLS handshake error from 10.197.25.15:8680: EOF\n","@timestamp":"2024-03-28T12:57:42.983Z"}
{"log.level":"error","service.name":"fleet-server","service.name":"fleet-server","message":"http: TLS handshake error from 10.197.26.34:14108: EOF\n","@timestamp":"2024-03-28T12:57:45.105Z"}
{"log.level":"error","service.name":"fleet-server","service.name":"fleet-server","message":"http: TLS handshake error from 10.197.24.59:53776: EOF\n","@timestamp":"2024-03-28T12:57:52.643Z"}
{"log.level":"error","service.name":"fleet-server","service.name":"fleet-server","message":"http: TLS handshake error from 10.197.25.15:5539: EOF\n","@timestamp":"2024-03-28T12:57:52.982Z"}
{"log.level":"error","service.name":"fleet-server","service.name":"fleet-server","message":"http: TLS handshake error from 10.197.26.34:27452: EOF\n","@timestamp":"2024-03-28T12:57:55.105Z"}

In these logs, the IPs 10.197.25.15, 10.197.26.34, 10.197.24.59 are AWS ALB private IPs. Could you please help me understand how to fix this issue? I believe this may be causing my APM service to not show the latest data.
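
To check whether every handshake error really comes from the load balancer, the errors can be tallied by source IP. The following is only a rough sketch: it assumes the logs are one JSON object per line in the format shown above, and the names `count_handshake_errors` and `SAMPLE` are made up for illustration.

```python
import json
import re
from collections import Counter

# Matches the message format seen in the log lines above.
HANDSHAKE_RE = re.compile(r"TLS handshake error from (\d+\.\d+\.\d+\.\d+):\d+")


def count_handshake_errors(lines):
    """Count TLS handshake errors per source IP in JSON log lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(record, dict):
            continue
        match = HANDSHAKE_RE.search(record.get("message", ""))
        if match:
            counts[match.group(1)] += 1
    return counts


# Three of the log lines from this post, kept as raw strings so that the
# escaped "\n" inside the message stays valid JSON.
SAMPLE = [
    r'{"log.level":"error","service.name":"fleet-server","service.name":"fleet-server","message":"http: TLS handshake error from 10.197.25.15:8680: EOF\n","@timestamp":"2024-03-28T12:57:42.983Z"}',
    r'{"log.level":"error","service.name":"fleet-server","service.name":"fleet-server","message":"http: TLS handshake error from 10.197.26.34:14108: EOF\n","@timestamp":"2024-03-28T12:57:45.105Z"}',
    r'{"log.level":"error","service.name":"fleet-server","service.name":"fleet-server","message":"http: TLS handshake error from 10.197.25.15:5539: EOF\n","@timestamp":"2024-03-28T12:57:52.982Z"}',
]

if __name__ == "__main__":
    for ip, n in count_handshake_errors(SAMPLE).most_common():
        print(ip, n)
```

If the only IPs in the result are the ALB private IPs, the errors are most likely caused by the load balancer itself rather than by enrolled agents.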

Hi Shayan_Ahmed, welcome to the community!

Can you share the config of your fleet-server/agent? What is the config of your agent policy, and Fleet server host, certificates?
Do you have APM integration installed?

Hi, thanks for your response. Below are the fleet and APM configuration and settings files. I have installed SSL and APM integrations.

Fleet config file

agent:
  id: 37f18aba-24a6-43f4-b2d1-504d711b4223
  monitoring.http:
    enabled: false
    host: ""
    port: 6791
fleet:
  enabled: true
  access_api_key: MWdtSmVvNEI3S0NjTXA0WVhQT3U6UldZQUxROThTcGFJRXZRRU5FNzNOdw==
  protocol: https
  host: ip-10-197-25-230.grpn-logging-stable.us-west-2xxxxxxxxx:8220
  ssl:
    verification_mode: full
    certificate_authorities:
    - /etc/pki/ca-trust/source/anchors/xxxxxRootCA.pem
    renegotiation: never
  timeout: 10m0s
  proxy_disable: true
  reporting:
    threshold: 10000
    check_frequency_sec: 30
  agent:
    id: ""
  server:
    policy:
      id: 499b5aa7-d214-5b5d-838b-3cd76469844e
    output:
      elasticsearch:
        protocol: https
        hosts:
        - logging-stable-data01.grpn-logging-stable.us-west-2.xxx.com:9200
        service_token: AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE3MTE0NDExODU2Mjg6TFJBOGlGR0NSSnFJbVZCRDE5VVVsUQ
        ssl:
          verification_mode: full
          certificate_authorities:
          - /etc/pki/ca-trust/source/anchors/xxxxxxxRootCA.pem
          renegotiation: never
        proxy_disable: false
        proxy_headers: {}
    host: 0.0.0.0
    port: 8220
    internal_port: 8221
    ssl:
      verification_mode: full
      certificate: /var/certs/cert-bundle.pem
      key: /var/certs/privkey.pem
      renegotiation: never

Agent Config file

agent:
  id: 68bd3dec-9d8e-4479-8e63-d107728dd63e
  logging.level: info
  monitoring.http:
    enabled: false
    host: ""
    port: 6791
fleet:
  enabled: true
  access_api_key: MXdtZWVvNEI3S0NjTXA0WW5QTng6SkZOWDljRFlUYXVhTVh0eDBwbC1jQQ==
  protocol: http
  host: logging-stable-data01.grpn-logging-stable.us-west-2.xxxxxxxxx.com:8220
  hosts:
  - https://logging-stable-data01.grpn-logging-stable.us-west-2.xxxxxx.com:8220
  ssl:
    verification_mode: full
    renegotiation: never
  timeout: 10m0s
  reporting:
    threshold: 10000
    check_frequency_sec: 30
  agent:
    id: ""

Hi

I would be very grateful if you could review and respond to my query.

Hello,

We have the exact same logs.

After investigation, these logs occur whenever there is a health check from the load balancer (in practice from the target group, since we use AWS). Is it the same for you?

As I understand it, the error means that the connection is made without a TLS handshake (a plain TCP or HTTP call instead of HTTPS). So I changed the health check to use the HTTPS protocol (it was TCP before), and now there are no more errors :)
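
To illustrate the difference between the two kinds of health check, here is a minimal Python sketch. It assumes the Fleet Server listens with TLS on the probed port; `tcp_probe` and `tls_probe` are hypothetical helper names, not part of any Elastic or AWS tooling. The first probe connects and closes without sending anything, which is the pattern that a TLS listener typically logs as a handshake error with EOF. The second completes a TLS handshake before closing.

```python
import socket
import ssl


def tcp_probe(host, port, timeout=3.0):
    """Mimic a TCP health check: connect, send nothing, then close."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def tls_probe(host, port, ca_file=None, timeout=3.0):
    """Mimic the first step of an HTTPS health check: a full TLS handshake."""
    # ca_file is an optional path to the CA that signed the server certificate.
    context = ssl.create_default_context(cafile=ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # wrap_socket performs the handshake and verifies the certificate.
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:  # ssl.SSLError and timeouts are subclasses of OSError
        return False
```

Running `tcp_probe` against the Fleet Server host and port while watching its logs should show whether a bare TCP connection reproduces the error in your setup.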

However, I did not find a route that returns HTTP code 200, so I set the health check path to / and the HTTP success code to 404.
Is there a route that returns 200?
EDIT: I found it, it is /api/status :) This comes from the GitHub issue "Allow HTTP to /api/status even when HTTPS is configured" (Issue #1567, elastic/fleet-server).
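
For reference, the health check settings described above could be expressed roughly as follows. This is a sketch under assumptions: `build_health_check_params` is a hypothetical helper, the ARN is a placeholder, and the resulting dictionary is meant for the boto3 `elbv2` client's `modify_target_group` call. Whether your target group type accepts these values is something to verify on your side.

```python
def build_health_check_params(target_group_arn, path="/api/status", success_codes="200"):
    """Return keyword arguments for an HTTPS health check on a target group."""
    return {
        "TargetGroupArn": target_group_arn,
        "HealthCheckProtocol": "HTTPS",  # was TCP before the change
        "HealthCheckPath": path,
        "Matcher": {"HttpCode": success_codes},
    }


# Placeholder ARN: replace it with the ARN of your own target group.
params = build_health_check_params(
    "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/NAME/ID"
)

# To apply the settings (requires boto3 and AWS credentials, not run here):
#   import boto3
#   boto3.client("elbv2").modify_target_group(**params)
```

The earlier workaround with path / and success code 404 corresponds to `build_health_check_params(arn, path="/", success_codes="404")`.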