Hey Everyone,
I'm having some trouble getting the Kubernetes integration to fully work on my ECK cluster.
What I'm trying to do is get elastic-agent to read container logs and forward them to ES.
Here's my elastic-agent.yaml:
---
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: elastic-agent
  namespace: default
spec:
  version: 8.12.0
  kibanaRef:
    name: kibana
  fleetServerRef:
    name: fleet-server
  mode: fleet
  policyID: eck-agent
  daemonSet:
    podTemplate:
      spec:
        tolerations:
        - key: "node-role.kubernetes.io/control-plane"
          operator: "Exists"
          effect: "NoSchedule"
        serviceAccountName: elastic-agent
        hostNetwork: true
        hostPID: true
        dnsPolicy: ClusterFirstWithHostNet
        automountServiceAccountToken: true
        securityContext:
          runAsUser: 0
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: elastic-agent
subjects:
- kind: ServiceAccount
  name: elastic-agent
  namespace: default
roleRef:
  kind: ClusterRole
  name: elastic-agent
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: default
  name: elastic-agent
subjects:
- kind: ServiceAccount
  name: elastic-agent
  namespace: default
roleRef:
  kind: Role
  name: elastic-agent
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: elastic-agent-kubeadm-config
  namespace: default
subjects:
- kind: ServiceAccount
  name: elastic-agent
  namespace: default
roleRef:
  kind: Role
  name: elastic-agent-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: elastic-agent
  labels:
    k8s-app: elastic-agent
rules:
- apiGroups: [""]
  resources:
  - nodes
  - namespaces
  - events
  - pods
  - services
  - configmaps
  - serviceaccounts
  - persistentvolumes
  - persistentvolumeclaims
  verbs: ["get", "list", "watch"]
- apiGroups: ["extensions"]
  resources:
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources:
  - statefulsets
  - deployments
  - replicasets
  - daemonsets
  verbs: ["get", "list", "watch"]
- apiGroups:
  - ""
  resources:
  - nodes/stats
  verbs:
  - get
- apiGroups: ["batch"]
  resources:
  - jobs
  - cronjobs
  verbs: ["get", "list", "watch"]
- nonResourceURLs:
  - "/metrics"
  verbs:
  - get
- apiGroups: ["rbac.authorization.k8s.io"]
  resources:
  - clusterrolebindings
  - clusterroles
  - rolebindings
  - roles
  verbs: ["get", "list", "watch"]
- apiGroups: ["policy"]
  resources:
  - podsecuritypolicies
  verbs: ["get", "list", "watch"]
- apiGroups: ["storage.k8s.io"]
  resources:
  - storageclasses
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elastic-agent
  namespace: default
  labels:
    k8s-app: elastic-agent
rules:
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elastic-agent-kubeadm-config
  namespace: default
  labels:
    k8s-app: elastic-agent
rules:
- apiGroups: [""]
  resources:
  - configmaps
  resourceNames:
  - kubeadm-config
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elastic-agent
  namespace: default
  labels:
    k8s-app: elastic-agent
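(For anyone following along: applying and sanity-checking the manifest looks roughly like this. ECK derives the elastic-agent-agent DaemonSet name from the Agent resource name, as the error below also shows.)

kubectl apply -f elastic-agent.yaml
kubectl -n default rollout status daemonset/elastic-agent-agent
kubectl -n default get agent elastic-agent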
I've tried messing around with volumeMounts, but when trying to specify them alongside volumes I get the following error:
Warning ReconciliationError 42m (x23 over 108m) agent-controller Reconciliation error: DaemonSet.apps "elastic-agent-agent" is invalid: spec.template.spec.containers[0].image: Required value
The volumes and volumeMounts can be found here. The issue is that they're referenced from these docs, and in those docs elastic-agent runs as a plain DaemonSet, not through the Agent CRD that comes with the ECK operator.
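(Since the DaemonSet fails validation, it is never actually created, so the error only surfaces as events on the Agent resource; roughly:)

kubectl -n default describe agent elastic-agent
kubectl -n default get events --field-selector involvedObject.name=elastic-agent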
Any help is greatly appreciated!
Cheers,
Luka
Here's the YAML I was working with that threw the error above:
---
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: elastic-agent
  namespace: default
spec:
  version: 8.12.0
  kibanaRef:
    name: kibana
  fleetServerRef:
    name: fleet-server
  mode: fleet
  policyID: eck-agent
  daemonSet:
    podTemplate:
      spec:
        tolerations:
        - key: "node-role.kubernetes.io/control-plane"
          operator: "Exists"
          effect: "NoSchedule"
        serviceAccountName: elastic-agent
        hostNetwork: true
        hostPID: true
        dnsPolicy: ClusterFirstWithHostNet
        automountServiceAccountToken: true
        securityContext:
          runAsUser: 0
        containers:
        - name: elastic-agent
          volumeMounts:
          - name: proc
            mountPath: /hostfs/proc
            readOnly: true
          - name: cgroup
            mountPath: /hostfs/sys/fs/cgroup
            readOnly: true
          - name: varlibdockercontainers
            mountPath: /var/lib/docker/containers
            readOnly: true
          - name: varlog
            mountPath: /var/log
            readOnly: true
          - name: etc-full
            mountPath: /hostfs/etc
            readOnly: true
          - name: var-lib
            mountPath: /hostfs/var/lib
            readOnly: true
          - name: etc-mid
            mountPath: /etc/machine-id
            readOnly: true
          - name: sys-kernel-debug
            mountPath: /sys/kernel/debug
          - name: elastic-agent-state
            mountPath: /usr/share/elastic-agent/state
        volumes:
        - name: agent-data
          emptyDir: {}
        - name: proc
          hostPath:
            path: /proc
        - name: cgroup
          hostPath:
            path: /sys/fs/cgroup
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: varlog
          hostPath:
            path: /var/log
        - name: etc-full
          hostPath:
            path: /etc
        - name: var-lib
          hostPath:
            path: /var/lib
        - name: etc-mid
          hostPath:
            path: /etc/machine-id
            type: File
        - name: sys-kernel-debug
          hostPath:
            path: /sys/kernel/debug
        - name: elastic-agent-state
          hostPath:
            path: /var/lib/elastic-agent-managed/default/state
            type: DirectoryOrCreate
(The ClusterRole, Roles, bindings, and ServiceAccount are identical to the first manifest above, so they're omitted here.)
The reference I used was from this post, where the image isn't specified either, but apparently it works.
EDIT: even when copying this example verbatim, I get the same error.
One more thing I've noticed: applying a new configuration recreates the pods and all agent state is lost (i.e. a massive number of Offline agents piles up in the Fleet UI), even though by default the data should persist across restarts:
kubectl describe po elastic-agent-agent-97mjg
Volumes:
  agent-data:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/elastic-agent/default/elastic-agent/state
    HostPathType:  DirectoryOrCreate
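(Worth noting: the agent-data path in this output, /var/lib/elastic-agent/default/elastic-agent/state, is ECK's default state volume, which differs from the custom elastic-agent-state hostPath in the manifest above, /var/lib/elastic-agent-managed/default/state. A quick way to list which host paths the pod actually mounts, as a diagnostic sketch using the pod name from above:)

kubectl -n default get pod elastic-agent-agent-97mjg \
  -o jsonpath='{range .spec.volumes[*]}{.name}{"\t"}{.hostPath.path}{"\n"}{end}'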
There's also a massive number of these errors (one for every module):
{"log.level":"error","@timestamp":"2024-02-09T13:57:49.055Z","message":"Failed to list light metricsets for module postgresql: getting metricsets for module 'postgresql': loading light module 'postgresql' definition: loading module configuration from '/usr/share/elastic-agent/data/elastic-agent-3c8be7/components/module/postgresql/module.yml': config file (\"/usr/share/elastic-agent/data/elastic-agent-3c8be7/components/module/postgresql/module.yml\") must be owned by the user identifier (uid=0) or root","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"log":{"source":"system/metrics-default"},"log.origin":{"file.line":145,"file.name":"mb/lightmodules.go"},"service.name":"metricbeat","ecs.version":"1.6.0","log.logger":"registry.lightmodules","ecs.version":"1.6.0"}
Seems to be related to this issue. The module files are indeed owned by elastic-agent, while the container itself is running as root. I can't find any errors related to why it doesn't remember its state, though. The hostPath seems to be mounted correctly and is present on the nodes:
[root@k8s-master01 ~]# ll /var/lib/elastic-agent/default/elastic-agent/state
total 0
drwxr-x---. 5 root root 75 Feb 9 15:43 data
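(To double-check the ownership mismatch from the log above, the file's uid/gid can be inspected inside a running pod; a diagnostic sketch, assuming the hash-suffixed path from the error message is still current:)

kubectl -n default exec elastic-agent-agent-97mjg -- \
  stat -c '%u:%g %n' /usr/share/elastic-agent/data/elastic-agent-3c8be7/components/module/postgresql/module.yml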
Managed to get it working. Posting here in case anyone needs it. To be honest, I have no clue which change fixed it; it was trial and error.
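(Comparing the two manifests, the most plausible decisive change is the container name: ECK generates a container named agent with the image already set, and a podTemplate entry with the same name is merged into it, while any other name, like elastic-agent in the failing manifest, declares a second container with no image, which matches the validation error above. In sketch form:)

containers:
- name: agent        # matches the ECK-generated container, so the operator's
                     # image and defaults are merged into this entry
  volumeMounts:
  - name: varlog
    mountPath: /var/log
    readOnly: true
# "- name: elastic-agent" would instead add a *second* container,
# which has no image and fails DaemonSet validation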
---
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: elastic-agent
  namespace: default
spec:
  version: 8.12.0
  kibanaRef:
    name: kibana
  fleetServerRef:
    name: fleet-server
  mode: fleet
  policyID: eck-agent
  daemonSet:
    podTemplate:
      spec:
        tolerations:
        - key: "node-role.kubernetes.io/control-plane"
          operator: "Exists"
          effect: "NoSchedule"
        serviceAccountName: elastic-agent
        hostNetwork: true
        hostPID: true
        dnsPolicy: ClusterFirstWithHostNet
        automountServiceAccountToken: true
        containers:
        - name: agent
          volumeMounts:
          - name: proc
            mountPath: /hostfs/proc
            readOnly: true
          - name: cgroup
            mountPath: /hostfs/sys/fs/cgroup
            readOnly: true
          - name: varlibdockercontainers
            mountPath: /var/lib/docker/containers
            readOnly: true
          - name: varlog
            mountPath: /var/log
            readOnly: true
          - name: etc-full
            mountPath: /hostfs/etc
            readOnly: true
          - name: var-lib
            mountPath: /hostfs/var/lib
            readOnly: true
          - name: etc-mid
            mountPath: /etc/machine-id
            readOnly: true
          - name: sys-kernel-debug
            mountPath: /sys/kernel/debug
          securityContext:
            runAsUser: 0
        volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: cgroup
          hostPath:
            path: /sys/fs/cgroup
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: varlog
          hostPath:
            path: /var/log
        - name: etc-full
          hostPath:
            path: /etc
        - name: var-lib
          hostPath:
            path: /var/lib
        - name: etc-mid
          hostPath:
            path: /etc/machine-id
            type: File
        - name: sys-kernel-debug
          hostPath:
            path: /sys/kernel/debug
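(After applying this, the Agent resource should report green health, and the DaemonSet pods can be listed via ECK's usual agent.k8s.elastic.co/name pod label; an assumption about the label, so adjust if your pods are labelled differently:)

kubectl -n default get agent elastic-agent
kubectl -n default get pods -l agent.k8s.elastic.co/name=elastic-agent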