Fleet - self-hosted production mode 7.16.3

Hello wonderful people,

I'm looking into setting up Fleet. So far I have a test server ready, and I have generated the certificates using elasticsearch-certutil:

Elasticsearch CA

/etc/elasticsearch/certs/elastic-stack-ca.p12

Elasticsearch certificate

/etc/elasticsearch/certs/elastic-certificates.p12

# Kibana certs

# EXPORT CA Certificate
mkdir -p /etc/kibana/certs
openssl pkcs12 -in /etc/elasticsearch/certs/elastic-stack-ca.p12 -clcerts -nokeys -chain -out /etc/kibana/certs/ca.pem
chmod 640 /etc/kibana/certs/ca.pem

# EXPORT Cert and Key for Kibana
/usr/share/elasticsearch/bin/elasticsearch-certutil cert --pem --ca /etc/elasticsearch/certs/elastic-stack-ca.p12 --dns localhost,"$cluster_name" --ip 127.0.0.1,"$ip_address" --out /etc/kibana/certs/kibana.zip

unzip -j /etc/kibana/certs/kibana.zip -d /etc/kibana/certs/

After unzipping I have the following:

# cert authority
ca.pem  
# server certificate
instance.crt  
# server certificate key
instance.key

Would I be able to use the certificates above for Fleet Server?
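
One way to sanity-check this before enrolling is to confirm that instance.crt chains to ca.pem and that instance.key actually belongs to instance.crt. The following is only a minimal sketch, assuming openssl is installed; check_cert_set is a hypothetical helper name, not part of any Elastic tool.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a PEM certificate set before handing it to Fleet Server.
# check_cert_set is a hypothetical helper, not part of any Elastic tool.
# Usage: check_cert_set CA_PEM SERVER_CERT SERVER_KEY
check_cert_set() {
  local ca="$1" cert="$2" key="$3"
  local cert_pub key_pub

  # 1. The server certificate must chain to the CA.
  if ! openssl verify -CAfile "$ca" "$cert" >/dev/null 2>&1; then
    echo "chain: FAIL"
    return 1
  fi

  # 2. The private key must match the public key embedded in the certificate.
  cert_pub="$(openssl x509 -in "$cert" -noout -pubkey)" || return 1
  key_pub="$(openssl pkey -in "$key" -pubout)" || return 1
  if [ "$cert_pub" != "$key_pub" ]; then
    echo "key: MISMATCH"
    return 1
  fi

  echo "ok"
}
```

With the files listed above, this would be called as check_cert_set /etc/kibana/certs/ca.pem /etc/kibana/certs/instance.crt /etc/kibana/certs/instance.key. Passing both checks does not guarantee that Fleet will accept the certificate: agents also need a subject alternative name that matches the host or IP they connect to. It is also worth confirming which CA path exists, since the export steps wrote the CA to /etc/kibana/certs/ca.pem while the enroll command below passes /etc/kibana/ca.pem.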

I have tried the following command to start Fleet Server, but it fails:

sudo elastic-agent enroll --url=https://192.168.131.155:8200 --fleet-server-es=https://192.168.131.155:9200 --fleet-server-service-token=hidden --fleet-server-policy=hidden --certificate-authorities=/etc/kibana/ca.pem --fleet-server-es-ca=/etc/kibana/ca.pem --fleet-server-cert=/etc/kibana/certs/instance.crt --fleet-server-cert-key=/etc/kibana/certs/instance.key
This will replace your current settings. Do you want to continue? [Y/n]:y
2022-01-14T19:21:49.758+1100    INFO    cmd/enroll_cmd.go:571   Spawning Elastic Agent daemon as a subprocess to complete bootstrap process.
2022-01-14T19:21:49.910+1100    INFO    application/application.go:67   Detecting execution mode
2022-01-14T19:21:49.911+1100    INFO    application/application.go:88   Agent is in Fleet Server bootstrap mode
2022-01-14T19:21:50.034+1100    INFO    [api]   api/server.go:62        Starting stats endpoint
2022-01-14T19:21:50.034+1100    INFO    application/fleet_server_bootstrap.go:130       Agent is starting
2022-01-14T19:21:50.034+1100    INFO    [api]   api/server.go:64        Metrics endpoint listening on: /var/lib/elastic-agent/data/tmp/elastic-agent.sock (configured: unix:///var/lib/elastic-agent/data/tmp/elastic-agent.sock)
2022-01-14T19:21:50.036+1100    INFO    application/fleet_server_bootstrap.go:140       Agent is stopped
2022-01-14T19:21:50.038+1100    INFO    stateresolver/stateresolver.go:48       New State ID is HqKuFOSV
2022-01-14T19:21:50.038+1100    INFO    stateresolver/stateresolver.go:49       Converging state requires execution of 1 step(s)
2022-01-14T19:21:50.080+1100    INFO    operation/operator.go:284       operation 'operation-install' skipped for fleet-server.7.16.3
2022-01-14T19:21:50.357+1100    INFO    log/reporter.go:40      2022-01-14T19:21:50+11:00 - message: Application: fleet-server--7.16.3[]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
2022-01-14T19:21:50.357+1100    INFO    stateresolver/stateresolver.go:66       Updating internal state
2022-01-14T19:21:50.760+1100    INFO    cmd/enroll_cmd.go:776   Fleet Server - Starting
2022-01-14T19:21:50.890+1100    INFO    log/reporter.go:40      2022-01-14T19:21:50+11:00 - message: Application: fleet-server--7.16.3[]: State changed to RESTARTING: exited with code: 1 - type: 'STATE' - sub_type: 'STARTING'
2022-01-14T19:21:50.890+1100    INFO    log/reporter.go:40      2022-01-14T19:21:50+11:00 - message: Application: fleet-server--7.16.3[]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
2022-01-14T19:21:50.890+1100    INFO    log/reporter.go:40      2022-01-14T19:21:50+11:00 - message: Application: fleet-server--7.16.3[]: State changed to RESTARTING: Restarting - type: 'STATE' - sub_type: 'STARTING'
2022-01-14T19:21:51.417+1100    INFO    log/reporter.go:40      2022-01-14T19:21:51+11:00 - message: Application: fleet-server--7.16.3[]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
2022-01-14T19:22:55.048+1100    WARN    status/reporter.go:236  Elastic Agent status changed to: 'degraded'
2022-01-14T19:22:55.048+1100    INFO    log/reporter.go:40      2022-01-14T19:22:55+11:00 - message: Application: fleet-server--7.16.3[]: State changed to DEGRADED: Missed last check-in - type: 'STATE' - sub_type: 'RUNNING'
2022-01-14T19:23:49.760+1100    INFO    cmd/run.go:184  Shutting down Elastic Agent and sending last events...
2022-01-14T19:23:49.761+1100    INFO    operation/operator.go:216       waiting for installer of pipeline 'default' to finish
2022-01-14T19:23:49.761+1100    INFO    process/app.go:176      Signaling application to stop because of shutdown: fleet-server--7.16.3
2022-01-14T19:23:55.056+1100    ERROR   status/reporter.go:236  Elastic Agent status changed to: 'error'
2022-01-14T19:23:55.056+1100    ERROR   log/reporter.go:36      2022-01-14T19:23:55+11:00 - message: Application: fleet-server--7.16.3[]: State changed to FAILED: Missed two check-ins - type: 'ERROR' - sub_type: 'FAILED'

2022-01-14T19:24:20.801+1100    INFO    status/reporter.go:236  Elastic Agent status changed to: 'online'
2022-01-14T19:24:20.801+1100    INFO    log/reporter.go:40      2022-01-14T19:24:20+11:00 - message: Application: fleet-server--7.16.3[]: State changed to STOPPED: Stopped - type: 'STATE' - sub_type: 'STOPPED'
2022-01-14T19:24:20.801+1100    INFO    cmd/run.go:192  Shutting down completed.
2022-01-14T19:24:20.801+1100    INFO    [api]   api/server.go:66        Stats endpoint (/var/lib/elastic-agent/data/tmp/elastic-agent.sock) finished: accept unix /var/lib/elastic-agent/data/tmp/elastic-agent.sock: use of closed network connection
Error: fleet-server failed: context canceled
For help, please see our troubleshooting guide at https://www.elastic.co/guide/en/fleet/7.16/fleet-troubleshooting.html
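
The output above is noisy, so it can help to reduce it to the fleet-server state transitions. This is a minimal sketch that relies only on the "State changed to ..." message format visible in the log lines above; state_changes is a hypothetical helper name and it reads log text on stdin.

```shell
# Sketch: reduce an Elastic Agent enroll log to its state transitions.
# state_changes is a hypothetical helper; it reads log text on stdin and
# prints "STATE: reason" for every "State changed to" line it finds.
state_changes() {
  sed -n 's/.*State changed to \([A-Z]*\): \(.*\) - type: .*/\1: \2/p'
}
```

For example, with the output saved to a file (enroll.log is an assumed name), state_changes < enroll.log would surface lines such as "RESTARTING: exited with code: 1", which shows that the fleet-server process itself exited shortly after starting.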

Thank you

I keep getting a "missing config fleet.agent.id (expected during bootstrap process)" error.

I started from scratch on another server and hit the same problem.

2022-01-15T04:04:05.219Z        INFO    cmd/enroll_cmd.go:571   Spawning Elastic Agent daemon as a subprocess to complete bootstrap process.
2022-01-15T04:04:05.418Z        INFO    application/application.go:67   Detecting execution mode
2022-01-15T04:04:05.420Z        INFO    application/application.go:88   Agent is in Fleet Server bootstrap mode
2022-01-15T04:04:05.797Z        INFO    [api]   api/server.go:62        Starting stats endpoint
2022-01-15T04:04:05.799Z        INFO    application/fleet_server_bootstrap.go:130       Agent is starting
2022-01-15T04:04:05.799Z        INFO    [api]   api/server.go:64        Metrics endpoint listening on: /var/lib/elastic-agent/data/tmp/elastic-agent.sock (configured: unix:///var/lib/elastic-agent/data/tmp/elastic-agent.sock)
2022-01-15T04:04:05.800Z        INFO    application/fleet_server_bootstrap.go:140       Agent is stopped
2022-01-15T04:04:05.806Z        INFO    stateresolver/stateresolver.go:48       New State ID is -kd1O2qe
2022-01-15T04:04:05.806Z        INFO    stateresolver/stateresolver.go:49       Converging state requires execution of 1 step(s)
2022-01-15T04:04:05.896Z        INFO    operation/operator.go:284       operation 'operation-install' skipped for fleet-server.7.16.3
2022-01-15T04:04:06.177Z        INFO    log/reporter.go:40      2022-01-15T04:04:06Z - message: Application: fleet-server--7.16.3[]: State changed to STARTING: Starting - type: 'STATE' - sub_type: 'STARTING'
2022-01-15T04:04:06.179Z        INFO    stateresolver/stateresolver.go:66       Updating internal state
2022-01-15T04:04:06.231Z        INFO    cmd/enroll_cmd.go:776   Fleet Server - Starting
2022-01-15T04:04:07.826Z        WARN    status/reporter.go:236  Elastic Agent status changed to: 'degraded'
2022-01-15T04:04:07.826Z        INFO    log/reporter.go:40      2022-01-15T04:04:07Z - message: Application: fleet-server--7.16.3[]: State changed to DEGRADED: Running on policy with Fleet Server integration: 499b5aa7-d214-5b5d-838b-3cd76469844e; missing config fleet.agent.id (expected during bootstrap process) - type: 'STATE' - sub_type: 'RUNNING'
2022-01-15T04:04:08.233Z        INFO    cmd/enroll_cmd.go:757   Fleet Server - Running on policy with Fleet Server integration: 499b5aa7-d214-5b5d-838b-3cd76469844e; missing config fleet.agent.id (expected during bootstrap process)
2022-01-15T04:04:08.775Z        INFO    cmd/enroll_cmd.go:454   Starting enrollment to URL: https://192.168.131.104:8200/
2022-01-15T04:04:08.882Z        WARN    cmd/enroll_cmd.go:465   Remote server is not ready to accept connections, will retry in a moment.

I can see that the id is missing (empty) in fleet.yml, but the guide doesn't say anything about this:

fleet.yml

/etc/elastic-agent# cat fleet.yml
agent:
  id: ""
  monitoring.http:
    enabled: false
    host: ""
    port: 6791
fleet:
  enabled: true
  access_api_key: ""
  protocol: http
  host: localhost:5601
  timeout: 10m0s
  proxy_disable: true
  reporting:
    threshold: 10000
    check_frequency_sec: 30
  agent:
    id: ""
  server:
    bootstrap: true
    policy:
      id: HIDDEN
    output:
      elasticsearch:
        protocol: https
        hosts:
        - 192.168.131.104:9200
        service_token: HIDDEN
        ssl:
          verification_mode: full
          certificate_authorities:
          - /etc/elastic-agent/certs/cert.crt
          renegotiation: never
        proxy_disable: false
        proxy_headers: {}
    host: 0.0.0.0
    port: 8220
    internal_port: 8221
    ssl:
      verification_mode: full
      certificate: /etc/elastic-agent/certs/fleet-server/fleet-server.crt
      key: /etc/elastic-agent/certs/fleet-server/fleet-server.key
      renegotiation: never

Hi @VamPikmin,

The enroll command you ran here specified port 8200 in --url. Fleet Server listens on port 8220 by default, so this is likely why it failed.
Did you make a similar mistake in your second attempt?
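
A quick way to catch this kind of typo before running enroll is to compare the port in the URL against the expected one. This is a minimal sketch; url_port and check_enroll_port are hypothetical helper names, and the default of 8220 is taken from the port mentioned above and from the fleet.yml posted earlier.

```shell
# Sketch: warn when an enrollment URL does not use the expected Fleet Server port.
# url_port and check_enroll_port are hypothetical helpers, not Elastic tooling.
# IPv6 literals such as https://[::1]:8220 are not handled.
url_port() {
  local rest="${1#*://}"   # drop the scheme, e.g. "https://"
  rest="${rest%%/*}"       # drop any path after the host
  case "$rest" in
    *:*) printf '%s\n' "${rest##*:}" ;;  # text after the last colon
    *)   printf '\n' ;;                  # no explicit port in the URL
  esac
}

# Usage: check_enroll_port URL [EXPECTED_PORT]   (EXPECTED_PORT defaults to 8220)
check_enroll_port() {
  local url="$1" expected="${2:-8220}" port
  port="$(url_port "$url")"
  if [ "$port" = "$expected" ]; then
    echo "ok: $url uses port $expected"
  else
    echo "mismatch: $url uses port '${port}', expected $expected"
    return 1
  fi
}
```

Running check_enroll_port https://192.168.131.155:8200 reports a mismatch, while the same URL with :8220 passes. If the port is the only problem, the original enroll command would presumably need just --url=https://192.168.131.155:8220 with the other flags unchanged.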

@MichelLaterman
That's spot on. Since then I have spun up a third server, and fleet-server is up and running.