Filebeat as DaemonSet for all Kubernetes Logs (including nginx)

I have attached my Filebeat manifest, which was taken from https://github.com/elastic/beats/blob/6.5/deploy/kubernetes/filebeat-kubernetes.yaml with the following changes: added the Elasticsearch endpoint with credentials, disabled filebeat.config.inputs, enabled autodiscover, and added annotations.
I was hoping the NGINX logs would get parsed, but they don't seem to be. I have also logged this at https://github.com/elastic/beats/issues/9768.

Has anybody come across this?

Image: docker.elastic.co/beats/filebeat:6.5.1
autodiscover:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      hints.enabled: true

annotations:

spec:
  template:
    metadata:
      labels:
        app: filebeat
      annotations:
        co.elastic.logs/module: nginx
        co.elastic.logs/fileset.stdout: access
        co.elastic.logs/fileset.stderr: error

Complete filebeat-kubernetes.yaml

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    app: filebeat
data:
  filebeat.yml: |-
    filebeat.config:
      #inputs:
        # Mounted `filebeat-inputs` configmap:
        # path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        # reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false

    # Hints-based autodiscover (the `filebeat.config.inputs` section above is disabled):
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          hints.enabled: true

    processors:
      - add_cloud_metadata:

    cloud.id: ${ELASTIC_CLOUD_ID}
    cloud.auth: ${ELASTIC_CLOUD_AUTH}

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      protocol: ${ELASTICSEARCH_PROTOCOL}
      #index: ${ELASTICSEARCH_INDEX}
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
    setup.template:
      enabled: true
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-inputs
  namespace: kube-system
  labels:
    app: filebeat
data:
  kubernetes.yml: |-
    - type: docker
      containers.ids:
      - "*"
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    app: filebeat
spec:
  template:
    metadata:
      labels:
        app: filebeat
      annotations:
        co.elastic.logs/module: nginx
        co.elastic.logs/fileset.stdout: access
        co.elastic.logs/fileset.stderr: error
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:6.5.1
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: es.mycompany.com
        - name: ELASTICSEARCH_PORT
          value: "9243"
        - name: ELASTICSEARCH_USERNAME
          value: mycompany
        - name: ELASTICSEARCH_PASSWORD
          value: mycompany
        - name: ELASTICSEARCH_PROTOCOL
          value: https
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: inputs
          mountPath: /usr/share/filebeat/inputs.d
          readOnly: true
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: inputs
        configMap:
          defaultMode: 0600
          name: filebeat-inputs
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    app: filebeat
---

Hey @sujituk

Those annotations you've mentioned need to go on the pods that are producing the NGINX logs, rather than on the Filebeat DaemonSet itself:

co.elastic.logs/module: nginx
co.elastic.logs/fileset.stdout: access
co.elastic.logs/fileset.stderr: error

This way you can customise the parsing behaviour for logs on a per-pod basis.
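
For example, on a typical ingress-nginx deployment the relevant fragment would look something like the sketch below (the controller name is illustrative; adjust it to however your controller is deployed). The key point is that the annotations sit on the controller's pod template, not on Filebeat's:

# Fragment of the NGINX ingress controller's Deployment/DaemonSet (not Filebeat's)
spec:
  template:
    metadata:
      labels:
        app: nginx-ingress-controller   # illustrative name
      annotations:
        co.elastic.logs/module: nginx
        co.elastic.logs/fileset.stdout: access
        co.elastic.logs/fileset.stderr: error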

@Evesy Got it. I will try that and keep you posted.
The log format is a customized one from the Kubernetes NGINX ingress controller:

log_format upstreaminfo
    '{{ if $cfg.useProxyProtocol }}$proxy_protocol_addr{{ else }}$remote_addr{{ end }} - '
    '[$the_real_ip] - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" "$http_user_agent" '
    '$request_length $request_time [$proxy_upstream_name] $upstream_addr '
    '$upstream_response_length $upstream_response_time $upstream_status $req_id';

Do you think this should be sent to a Logstash pipeline with an appropriate grok pattern? Currently I'm sending it directly to Elasticsearch and expecting the grokking to happen.
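
For reference, whichever route is taken, the grok needed for that format would look roughly like the following, shown here as a Logstash filter. This is only a sketch: the field names are illustrative and the pattern has not been tested against real log lines (retried upstreams, for example, can put comma-separated lists in the upstream fields).

filter {
  grok {
    match => {
      # One pattern covering the whole upstreaminfo line above
      "message" => '%{IPORHOST:remote_addr} - \[%{NOTSPACE:the_real_ip}\] - %{NOTSPACE:remote_user} \[%{HTTPDATE:time_local}\] "%{DATA:request}" %{NUMBER:status} %{NUMBER:body_bytes_sent} "%{DATA:http_referer}" "%{DATA:http_user_agent}" %{NUMBER:request_length} %{NUMBER:request_time} \[%{DATA:proxy_upstream_name}\] %{NOTSPACE:upstream_addr} %{NUMBER:upstream_response_length} %{NUMBER:upstream_response_time} %{NUMBER:upstream_status} %{NOTSPACE:req_id}'
    }
  }
}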

We have the same use case, @sujituk.

We created a custom Filebeat module specifically for ingress-nginx, and in that module included an Elasticsearch ingest pipeline that does the processing for us (grok field parsing + GeoIP processing + user agent processing).
The docs at https://www.elastic.co/guide/en/beats/devguide/current/filebeat-modules-devguide.html can help you get your own module created. I'm also happy to assist if you have any questions.

You could, of course, just send to Logstash if that's easier for you; however, I prefer the integration between Filebeat and ingest pipelines.
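
If building a full module feels like too much to start with, a lighter-weight option is an autodiscover template with the input-level pipeline option. The sketch below assumes an ingest pipeline named nginx-ingress-access (containing the grok/GeoIP/user agent processors) has already been created in Elasticsearch, and that the controller's container is named nginx-ingress-controller; it would be used in place of the hints-only provider shown earlier:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            contains:
              kubernetes.container.name: nginx-ingress-controller   # illustrative container name
          config:
            - type: docker
              containers.ids:
                - "${data.kubernetes.container.id}"
              # Route matching events through the custom ingest pipeline (hypothetical name)
              pipeline: nginx-ingress-access

With the custom module approach described above, the pipeline is instead loaded by the module itself and selected via the usual module/fileset hints.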
