Filebeat deployment in Kubernetes/Docker

The general guideline for deploying Filebeat in Docker/Kubernetes is to run one instance (container) of Filebeat on each Kubernetes node, and to harvest the logs located under "/var/lib/docker/containers/*/*.log".
However, I can't find a way to define a prospector/processor that will parse these logs and do additional parsing of the actual application log (Apache access logs, for example) - something like running the built-in modules on this field.
The only thing I found in the documentation is how to enrich the log with Kubernetes info (pod name, container, etc.)
Has anyone done such a thing?

Hi @Asher_Shoshan,

Have a look at Autodiscover settings: https://www.elastic.co/guide/en/beats/filebeat/6.1/configuration-autodiscover.html.

You can statically map images to the settings you want to use. You can also use labels to dynamically decide how to fetch logs from a container.
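For example, a label-based template could look something like this (a minimal sketch; the logformat label is just a made-up convention you would set on your own containers):

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        # Match containers by a custom label instead of by image name
        - condition:
            equals:
              docker.container.labels.logformat: "apache"   # hypothetical label
          config:
            - module: apache2
              access:
                prospector:
                  type: docker
                  container.ids:
                    - "${data.docker.container.id}"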

This feature was released with 6.1, and as always, feedback is really appreciated :slight_smile:

Best regards

Thanks,
So should I use autodiscover instead of:
processors:
  - add_kubernetes_metadata:
      in_cluster: true
      namespace: ${POD_NAMESPACE}

Still, what do I do with autodiscover in order to parse the log and see the fields in Kibana (for example all the fields of the access log: HTTP request, response code, etc.), and not just one JSON field 'log'?

Tried 'autodiscover' with Docker. Getting "unable to connect to docker 'unix:///docker....'":

2018/01/03 11:01:22.421490 beat.go:635: CRIT Exiting: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

You may need to check that Filebeat has permissions to access the socket file, or run it as root

Yes... Kubernetes as well

Can you elaborate? How do I run it as root?

Sure, for kubernetes you have a complete example here: https://raw.githubusercontent.com/elastic/beats/6.1/deploy/kubernetes/filebeat-kubernetes.yaml

The trick comes from:

securityContext:
  runAsUser: 0
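For context, this is roughly where that snippet sits in the DaemonSet pod spec (trimmed sketch; the image tag and the rest of the fields are in the full manifest linked above):

spec:
  template:
    spec:
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:6.1.1   # use the tag from the linked manifest
          securityContext:
            runAsUser: 0   # run as root so Filebeat can read the log files and the docker socket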

For Docker, just use docker run -u root; you may want to pass --stirct.perms=false to Filebeat to avoid errors due to config file ownership

Thanks, but again... how to solve the original issue?
i.e. one Filebeat container per Kubernetes node harvests all containers' logs in /var/lib/docker/containers/*/*.log.
Then the application logs (Apache, Redis, etc.) are not parsed before being shipped to Elasticsearch (they end up in one field), therefore I cannot get meaningful results in Kibana.
Is there a way to do the parsing inside the processor, or to reapply the built-in modules?
(I haven't seen a way to define my own parsing.)

Typo: --stirct.perms=false should be --strict.perms=false?

+1 to Asher's: "how to solve the original issue?"

Once you have Filebeat running you can use autodiscover (https://www.elastic.co/guide/en/beats/filebeat/6.1/configuration-autodiscover.html) by adding this to your filebeat.yml:

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.image: "httpd"
          config:
            - module: apache2
              access:
                prospector:
                  type: docker
                  container.ids:
                    - "${data.docker.container.id}"

This detects Apache instances (httpd image) and launches the Filebeat apache2 module to parse their logs (from the container)

Also, in order to get access to Docker you should mount the Docker socket into the Filebeat container; add these to your Filebeat DaemonSet spec:

To volumeMounts:

- name: dockersock
  mountPath: /var/run/docker.sock

To volumes:

- name: dockersock
  hostPath:
    path: /var/run/docker.sock
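Putting both pieces together, the relevant part of the DaemonSet would look roughly like this (a trimmed sketch; only the dockersock entries come from the snippets above, the rest of the container spec is as in the example manifest):

spec:
  template:
    spec:
      containers:
        - name: filebeat
          volumeMounts:
            - name: dockersock
              mountPath: /var/run/docker.sock   # socket the docker autodiscover provider talks to
      volumes:
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock          # docker socket on the host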

Hi,
Added the docker.sock mount, and Filebeat started with no errors - however nothing is happening,
i.e. autodiscover is not triggering a new prospector when I launch a new container, and nothing arrives in Elasticsearch.

my filebeat.yml:

filebeat.modules:
 - module: nginx
 - module: apache2
 - module: system

filebeat.autodiscover:
  providers:
   - type: docker
     templates:
       - condition:
           equals:
             docker.container.name: "nginx"
         config:
           - module: nginx
             log:
               prospector:
                 type: docker
                 container.ids:
                   - "${data.docker.container.id}"

output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:10.100.66.194}:${ELASTICSEARCH_PORT:9200}']

output.console:
  pretty: true
  enabled: false

Without autodiscovery, I used the below yml file, and again the "message" part of the JSON line (in the container logs) is not parsed by the module - so I cannot really use it properly in Elasticsearch/Kibana.

filebeat.modules:
 - module: nginx

filebeat.prospectors:
 - type: docker
   paths:
    - /var/lib/docker/containers/*/*.log
   templates:
     - condition:
         equals:
           docker.container.name: "nginx"
       config:
         - module: nginx
           log:
             prospector:
               type: docker
               container.ids:
                 - "${data.docker.container.id}"
processors:
  - add_docker_metadata:

output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:10.100.66.194}:${ELASTICSEARCH_PORT:9200}']

output.console:
  pretty: true
  enabled: false

You may want to change the condition from equals to contains, as docker.container.name includes the full path of the image + the version.

Give it a try; if that doesn't work, you can review autodiscover messages by running Filebeat with the -d autodiscover,docker -e -v flags

Also, the nginx module doesn't have a log fileset; you will need to switch to access:

filebeat.autodiscover:
  providers:
   - type: docker
     templates:
       - condition:
           contains:
             docker.container.name: "nginx"
         config:
           - module: nginx
             access:
               prospector:
                 type: docker
                 container.ids:
                   - "${data.docker.container.id}"

Hi,
Autodiscovery is still not working (with the 'contains' + 'access' changes) - I will have to try and run with debug.
As I wrote before, (without autodiscovery) lines are arriving in Elasticsearch but the message itself (the access log line) is not parsed by Filebeat - it seems like the condition/filter is not working.
Is there a way to write/customize my own module? What is the approach for application logs (other than access logs)? Should I use Logstash as well?

Here is the line as shown in Kibana:
{
  "_index": "filebeat-6.1.1-2018.01.11",
  "_type": "doc",
  "_id": "AWDl2uwphR8b_vpAFHrW",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-01-11T15:33:36.442Z",
    "offset": 685191,
    "stream": "stdout",
    "message": "10.45.104.170 - - [11/Jan/2018:15:33:36 +0000] \"GET / HTTP/1.1\" 200 612 \"-\" \"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0\" \"-\"",
    "source": "/var/lib/docker/containers/039d7f5ed375cb77202718b677dd5d46beb670422b8778608befddfdf6c9bc19/039d7f5ed375cb77202718b677dd5d46beb670422b8778608befddfdf6c9bc19-json.log",
    "prospector": {
      "type": "docker"
    },
    "beat": {
      "name": "e03aca2ac179",
      "hostname": "e03aca2ac179",
      "version": "6.1.1"
    },
    "docker": {
      "container": {
        "id": "039d7f5ed375cb77202718b677dd5d46beb670422b8778608befddfdf6c9bc19",
        "labels": {
          "maintainer": "NGINX Docker Maintainers docker-maint@nginx.com"
        },
        "image": "nginx",
        "name": "nginx"
      }
    }
  },
  "fields": {
    "@timestamp": [
      1515684816442
    ]
  },
  "sort": [
    1515684816442
  ]
}

Hi,

You can write your own modules, or use Logstash or an Elasticsearch ingest node to parse raw lines from Filebeat. Here are some useful links:

https://www.elastic.co/guide/en/beats/devguide/6.1/filebeat-modules-devguide.html

https://www.elastic.co/guide/en/logstash/6.1/advanced-pipeline.html

https://www.elastic.co/guide/en/beats/filebeat/6.1/configuring-ingest-node.html

Best regards

Thanks Carlos, I'm confused...

Is Filebeat actually doing the parsing/changing of lines before sending to Elasticsearch, or does it use the ingest node, so the parsing is done on the Elasticsearch side?
If so, why is there a definition of 'modules' in Filebeat? So I must use both? i.e. put the explicit pipeline in the 'output' section (and which one)...

The documentation is not so clear, and there are not many examples of this particular configuration (nginx/apache2/etc.)

Please elaborate

Hi @Asher_Shoshan,

Filebeat modules package all the different parts of the stack that you would otherwise have to configure and deploy yourself, including: ingest pipelines, template mappings, dashboards and, sometimes, machine learning jobs.

You can choose to use prospectors instead and do all the ingest settings by yourself. You can also write your own module to centralize all the settings.
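For example, a minimal sketch of the prospector-only approach, assuming you have already created an ingest pipeline yourself in Elasticsearch (my-app-pipeline and the log path below are made-up names for illustration):

filebeat.prospectors:
  - type: log
    paths:
      - /var/log/myapp/*.log      # hypothetical application log path
    # Parsing happens in Elasticsearch: this must reference a pipeline you
    # created beforehand, e.g. with PUT _ingest/pipeline/my-app-pipeline
    pipeline: my-app-pipeline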

Ok... (but how?) See the previous correspondence for the difficulties I had...

Since I'm working with Kubernetes and Docker, I'm looking for how to deploy one container of Filebeat which will parse all the Docker logs of all containers on that node. Therefore I need this Filebeat container to act as a distributor, with some 'conditional' capabilities so it knows how to parse each log (by metadata) before sending it to Elasticsearch.
I guess the other option is to install Filebeat as a 'sidecar' on every container and ship the log directly to Elasticsearch, which is not the preferred way.

I wish I had some detailed examples/documentation, or at least knew whether anybody else is using it this way.
Thanks,

See below my filebeat.yml which is working and parses OK (with Elasticsearch ingest); but again - I'm missing how to handle (if/then/else) other logs. For example, nginx directs both the access log and the error log to the console, so they end up in the same JSON file in the container - so what should I write in the YAML? Right now only the access log is handled.

processors:
    - add_kubernetes_metadata:
        in_cluster: true
        namespace: ${POD_NAMESPACE}
    - add_docker_metadata:

filebeat.autodiscover:
  providers:
   - type: docker
     templates:
       - condition:
           contains:
             docker.container.name: "nginx"
         config:
           - module: nginx
             access:
               prospector:
                 type: docker
                 paths:
                 - /var/lib/docker/containers/${data.docker.container.id}/*.log
                 pipeline: filebeat-6.1.1-nginx-access-default
                 container.ids:
                   - "${data.docker.container.id}"

output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']