Hi yago82,
Many thanks for your reply.
I updated the agent policy in Kibana and selected the working Fleet Server URL. This policy update created a new revision of the agent policy. The agent overview in Kibana now shows that the agents are running with an outdated policy. Because the Elasticsearch URL is configured correctly for these agents, they still send monitoring and log data. However, their current status is offline and they do not fetch the new policy revision. The agent log contains the following line:
{"log.level":"error","@timestamp":"<timestamp>","log.origin":{"file.name":"fleet/fleet_gateway.go","file.line":197},"message":"Cannot checkin in with fleet-server, retrying","log":{"source":"elastic-agent"},"error":{"message":"fail to checkin to fleet-server: all hosts failed: 1 error occurred:\n\t* requester 0/1 to host https://<wrong fleet server url> errored: Post \"https://<wrong fleet server url>/api/fleet/agents/<agent id>/checkin?\": dial tcp 1<wrong fleet server url>: connect: no route to host\n\n"},"request_duration_ns":3079800091,"failed_checkins":806,"retry_after_ns":467875157778,"ecs.version":"1.6.0"}
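To get a feeling for how long the agent waits between attempts, the retry_after_ns value from that line can be converted to minutes and seconds. This is only a small sketch: I am assuming the field really is in nanoseconds (as the _ns suffix suggests), and ns_to_min_sec is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: convert a nanosecond count to whole minutes and seconds.
# Assumption: the *_ns fields in the agent log are nanoseconds.
ns_to_min_sec() {
  total_s=$(( $1 / 1000000000 ))   # drop the sub-second part
  printf '%dm%ds\n' $(( total_s / 60 )) $(( total_s % 60 ))
}

# retry_after_ns taken from the log line above
ns_to_min_sec 467875157778   # prints 7m47s
```

So after 806 failed check-ins the agent appears to wait almost eight minutes before it tries the wrong URL again.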
I found out that I am able to restart the Elastic Agent in the container with the command "elastic-agent restart" without losing the pod. Yeah!
But it seems that no configuration containing the Fleet Server URL is stored in the container: #> find /usr/share/elastic-agent -name * -type f | xargs grep "“ /dev/null | more
Only the log files contain the Fleet Server URL. Running the command "elastic-agent status" shows me:
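In case the search itself is not reliable (an unquoted * can be expanded by the shell before find ever sees it), a more careful version could look roughly like this. It is only a sketch: search_tree is a made-up helper name, and the search string in the example call is a placeholder, not the real host.

```shell
#!/bin/sh
# Hypothetical helper: list all regular files below a directory that
# contain a given string.
search_tree() {
  dir=$1
  needle=$2
  # -type f : regular files only
  # grep -l : print only the names of matching files
  # grep -F : treat the needle as a literal string, not a regex
  # --      : stop option parsing, in case the needle starts with a dash
  find "$dir" -type f -exec grep -lF -- "$needle" {} +
}

# Example call inside the container (placeholder search string):
# search_tree /usr/share/elastic-agent '<wrong fleet server url>'
```

Using -exec ... {} + instead of a pipe to xargs also avoids problems with file names that contain spaces.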
┌─ fleet
│  └─ status: (FAILED) fail to checkin to fleet-server: all hosts failed: 1 error occurred:
│     * requester 0/1 to host https://<wrong fleet server url>/ errored: Post "https://<wrong fleet server url>/api/fleet/agents/<agent id>/checkin?": dial tcp <wrong fleet server url>: connect: no route to host
│
│
└─ elastic-agent
   └─ status: (HEALTHY) Running
The command "elastic-agent inspect" shows me:
fleet:
  access_api_key: <api key>
  agent:
    id: <agent id>
  enabled: true
  host: localhost:5601
  hosts:
  - https://<wrong fleet server url>
  protocol: http
  ssl:
    renegotiation: never
    verification_mode: none
  timeout: 10m0s
So I tried to follow your advice and put the Fleet Server URL into the file /usr/share/elastic-agent/elastic-agent.yml, but I could not figure out the correct configuration from the documentation. Can you provide a correct configuration snippet? Maybe something shaped like the inspect output above? For example:
fleet:
  hosts:
  - https://<correct fleet server url>
Some further information regarding my deployment: as explained in "Run Elastic Agent on Kubernetes managed by Fleet" (Fleet and Elastic Agent Guide [8.11]), I used the file elastic-agent-managed-kubernetes.yaml to set up my DaemonSet. The following environment variables are set:
env:
  - name: FLEET_ENROLL
    value: "1"
  - name: FLEET_INSECURE
    value: "true"
  - name: FLEET_URL
    value: "https://<correct fleet server url>"
  - name: FLEET_ENROLLMENT_TOKEN
    value: "<enrollment token agent policy>"
  - name: KIBANA_HOST
    value: "http://kibana:5601"
  - name: KIBANA_FLEET_USERNAME
    value: "elastic"
  - name: KIBANA_FLEET_PASSWORD
    value: "changeme"
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: ELASTIC_NETINFO
    value: "false"
It’s currently very frustrating. Hopefully someone has an idea how to fix it. Thanks in advance.