Hello!
On our Kubernetes infrastructure, we are facing a connection issue when starting the Filebeat pod.
Affected Filebeat versions: 6.4.0 and 7.0.
The error log record is:
ERROR kubernetes/kubernetes.go:127 Error starting kubernetes autodiscover provider: performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
The Kubernetes manifest is:
filebeat-daemonset.yml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config-json
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.registry_file: "/opt/filebeat/registry"
    logging.to_stderr: true
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          include_annotations: ["logging"]
          templates:
            - condition:
                equals:
                  kubernetes.annotations.logging: "plain"
              config:
                - type: docker
                  containers.ids:
                    - "${data.kubernetes.container.id}"
                  processors:
                    - add_cloud_metadata: ~
    processors:
      - drop_fields:
          fields: ["host"]
    output.logstash:
      hosts: ["logstash-host:9200"]
    logging:
      to_files: true
      files:
        path: "/opt/filebeat/logs"
        name: filebeat.log
        rotateeverybytes: 10485760 # = 10MB
        keepfiles: 7
      level: info
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat-daemon
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      name: filebeat
      k8s-app: filebeat
  template:
    metadata:
      labels:
        name: filebeat
        k8s-app: filebeat
    spec:
      containers:
        - name: filebeat-container
          image: docker-registry-host/shared/filebeat
          volumeMounts:
            - mountPath: /var/lib/docker/containers
              name: filebeat-log-source
            - name: filebeat-config
              mountPath: /opt/filebeat
            - name: varlogcont
              mountPath: /var/log/containers
              readOnly: true
            - name: varlogpods
              mountPath: /var/log/pods
              readOnly: true
      imagePullSecrets:
        - name: regcred
      volumes:
        - name: filebeat-log-source
          hostPath:
            path: /var/lib/docker/containers
            type: DirectoryOrCreate
        - name: filebeat-config
          configMap:
            defaultMode: 0644
            name: filebeat-config-json
        - name: varlogcont
          hostPath:
            path: /var/log/containers
            type: DirectoryOrCreate
        - name: varlogpods
          hostPath:
            path: /var/log/pods
            type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
  - kind: ServiceAccount
    name: default
    namespace: default
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
  - apiGroups: [""] # "" indicates the core API group
    resources:
      - namespaces
      - pods
    verbs:
      - get
      - watch
      - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: default
  labels:
    k8s-app: filebeat
The log messages are:
log-messages
...
2019-07-25T03:23:53.420Z WARN [cfgwarn] kubernetes/kubernetes.go:51 BETA: The kubernetes autodiscover is beta
2019-07-25T03:23:53.421Z INFO kubernetes/util.go:86 kubernetes: Using pod name filebeat-daemon-srjr4 and namespace default to discover kubernetes node
2019-07-25T03:23:53.423Z ERROR kubernetes/util.go:90 kubernetes: Querying for pod failed with error: %!(EXTRA string=performing request: Get https://10.96.0.1:443/api/v1/namespaces/default/pods/filebeat-daemon-srjr4: dial tcp 10.96.0.1:443: connect: connection refused)
2019-07-25T03:23:53.423Z INFO autodiscover/autodiscover.go:105 Starting autodiscover manager
2019-07-25T03:23:53.423Z INFO kubernetes/watcher.go:180 kubernetes: Performing a resource sync for *v1.PodList
2019-07-25T03:23:53.424Z ERROR kubernetes/watcher.go:183 kubernetes: Performing a resource sync err performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused for *v1.PodList
2019-07-25T03:23:53.424Z ERROR kubernetes/kubernetes.go:127 Error starting kubernetes autodiscover provider: performing request: Get https://10.96.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
2019-07-25T03:24:23.418Z INFO [monitoring] log/log.go:141 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":30,"time":{"ms":33}},"total":{"ticks":60,"time":{"ms":63},"value":60},"user":{"ticks":30,"time":{"ms":30}}},"info":{"ephemeral_id":"03762f04-9dd8-416e-829c-d9bcba1fde01","uptime":{"ms":30020}},"memstats":{"gc_next":4194304,"memory_alloc":1571712,"memory_total":3527480,"rss":19968000}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"output":{"type":"logstash"},"pipeline":{"clients":0,"events":{"active":0}}},"registrar":{"states":{"current":0},"writes":{"success":1,"total":1}},"system":{"cpu":{"cores":2},"load":{"1":5.7,"15":4.33,"5":4.47,"norm":{"1":2.85,"15":2.165,"5":2.235}}}}}}
While investigating this issue, we found a temporary workaround on StackOverflow:
but that workaround relies on an arbitrary sleep before Filebeat starts, and we cannot predict the required delay across different infrastructures.
The suggested fix is for the Kubernetes client to retry the initial connection on startup; a rough sketch of that pattern follows.
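For illustration only, here is a minimal Go sketch of the retry-with-backoff pattern we have in mind, built on the standard library alone. The names retryWithBackoff and startProvider are hypothetical and do not correspond to the actual Beats or Kubernetes client code.

retry-sketch.go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// retryWithBackoff retries fn with exponential backoff and jitter until it
// succeeds or maxAttempts is exhausted.
func retryWithBackoff(maxAttempts int, initial, max time.Duration, fn func() error) error {
	delay := initial
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = fn(); lastErr == nil {
			return nil
		}
		fmt.Printf("attempt %d failed: %v; retrying in %s\n", attempt, lastErr, delay)
		// Add a little jitter so many pods do not retry in lockstep.
		time.Sleep(delay + time.Duration(rand.Int63n(int64(delay/2)+1)))
		if delay *= 2; delay > max {
			delay = max
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	// startProvider stands in for the initial pod query that currently fails
	// once and aborts the autodiscover provider (the request to 10.96.0.1:443).
	startProvider := func() error {
		return errors.New("dial tcp 10.96.0.1:443: connect: connection refused")
	}

	if err := retryWithBackoff(4, 250*time.Millisecond, 2*time.Second, startProvider); err != nil {
		fmt.Println("autodiscover provider failed to start:", err)
	}
}

With jittered backoff, transient "connection refused" errors during cluster bootstrap would no longer abort the autodiscover provider, and many Filebeat pods restarting at once would not retry against the API server in lockstep.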
We also found these issues about the current Kubernetes client in the Beats GitHub repository:
and
My question is: are there plans to fix these issues, and if so, on what timeline?