Filebeat deployment in Kubernetes/Docker

The general guideline for deploying Filebeat in Docker/Kubernetes is to run one instance (container) of Filebeat on each Kubernetes node, and to harvest the logs located under "/var/lib/docker/containers/*/*.log".
However, I can't find a way to define a prospector/processor that will parse these logs and do additional parsing of the actual application log (Apache access logs, for example) - something like running the built-in modules on this field.
The only thing I found in the documentation is how to enrich the log with Kubernetes info (pod name, container, etc.)
Has anyone done such a thing?

Hi @Asher_Shoshan,

Have a look at Autodiscover settings: https://www.elastic.co/guide/en/beats/filebeat/6.1/configuration-autodiscover.html.

You can statically map images to the settings you want to use. You can also use labels to dynamically decide how to fetch logs from a container.
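For example, a label-based template could look something like this (a minimal sketch; the logformat label is just a made-up convention you would set on your own containers):

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        # Match containers by a custom label instead of by image name
        - condition:
            equals:
              docker.container.labels.logformat: "apache"   # hypothetical label
          config:
            - module: apache2
              access:
                prospector:
                  type: docker
                  container.ids:
                    - "${data.docker.container.id}"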

This feature was released with 6.1, and as always, feedback is really appreciated :slight_smile:

Best regards

Thanks,
So should I use autodiscover instead of:
processors:
  - add_kubernetes_metadata:
      in_cluster: true
      namespace: ${POD_NAMESPACE}

Still, what do I do with autodiscover in order to parse the log and see the fields in Kibana (for example all the fields of the access log: HTTP request, response code, etc.), and not just one JSON field 'log'?

Tried 'autodiscover' with Docker. Getting "unable to connect to docker 'unix:///docker....'":

2018/01/03 11:01:22.421490 beat.go:635: CRIT Exiting: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

You may need to check that Filebeat has permissions to access the socket file, or run it as root

Yes... Kubernetes as well

Can you elaborate? How do I run it as root?

Sure, for kubernetes you have a complete example here: https://raw.githubusercontent.com/elastic/beats/6.1/deploy/kubernetes/filebeat-kubernetes.yaml

The trick comes from:

securityContext:
  runAsUser: 0
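For context, this is roughly where that snippet sits in the DaemonSet pod spec (trimmed sketch; the image tag and the rest of the fields are in the full manifest linked above):

spec:
  template:
    spec:
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:6.1.1   # use the tag from the linked manifest
          securityContext:
            runAsUser: 0   # run as root so Filebeat can read the log files and the docker socket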

For Docker, just use docker run -u root; you may want to pass --stirct.perms=false to Filebeat to avoid errors due to config file ownership

Thanks, but again... how to solve the original issue?
i.e. one Filebeat container per Kubernetes node harvests all containers' logs in /var/lib/docker/containers/*/*.log.
Then the application logs (Apache, Redis, etc.) are not parsed before being shipped to Elasticsearch (they end up in one field), therefore I cannot get meaningful results in Kibana.
Is there a way to do the parsing inside the processor, or to reapply the built-in modules?
(I haven't seen a way to define my own parsing.)

Typo: --stirct.perms=false should be --strict.perms=false?

+1 to Asher's: "how to solve the original issue?"

Once you have Filebeat running you can use autodiscover (https://www.elastic.co/guide/en/beats/filebeat/6.1/configuration-autodiscover.html) by adding this to your filebeat.yml:

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.image: "httpd"
          config:
            - module: apache2
              access:
                prospector:
                  type: docker
                  container.ids:
                    - "${data.docker.container.id}"

This detects Apache instances (httpd image) and launches the Filebeat apache2 module to parse their logs (from the container)

Also, in order to get access to Docker you should mount the Docker socket into the Filebeat container; add these to your Filebeat DaemonSet spec:

To volumeMounts:

- name: dockersock
  mountPath: /var/run/docker.sock

To volumes:

- name: dockersock
  hostPath:
    path: /var/run/docker.sock
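Putting both pieces together, the relevant part of the DaemonSet would look roughly like this (a trimmed sketch; only the dockersock entries come from the snippets above, the rest of the container spec is as in the example manifest):

spec:
  template:
    spec:
      containers:
        - name: filebeat
          volumeMounts:
            - name: dockersock
              mountPath: /var/run/docker.sock   # socket the docker autodiscover provider talks to
      volumes:
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock          # docker socket on the host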

Hi,
Added the docker.sock mount, and Filebeat started with no errors - however nothing is happening,
i.e. autodiscover is not triggering a new prospector when I launch a new container, and nothing arrives in Elasticsearch.

my filebeat.yml:

filebeat.modules:
 - module: nginx
 - module: apache2
 - module: system

filebeat.autodiscover:
  providers:
   - type: docker
     templates:
       - condition:
           equals:
             docker.container.name: "nginx"
         config:
           - module: nginx
             log:
               prospector:
                 type: docker
                 container.ids:
                   - "${data.docker.container.id}"

output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:10.100.66.194}:${ELASTICSEARCH_PORT:9200}']

output.console:
  pretty: true
  enabled: false

Without autodiscovery, I used the below yml file, and again the "message" part of the JSON line (in the container logs) is not parsed by the module - so I cannot really use it properly in Elasticsearch/Kibana.

filebeat.modules:
 - module: nginx

filebeat.prospectors:
 - type: docker
   paths:
    - /var/lib/docker/containers/*/*.log
   templates:
     - condition:
         equals:
           docker.container.name: "nginx"
       config:
         - module: nginx
           log:
             prospector:
               type: docker
               container.ids:
                 - "${data.docker.container.id}"
processors:
  - add_docker_metadata:

output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:10.100.66.194}:${ELASTICSEARCH_PORT:9200}']

output.console:
  pretty: true
  enabled: false

You may want to change the condition from equals to contains, as docker.container.name includes the full path of the image + the version.

Give it a try; if that doesn't work, you can review autodiscover messages by running Filebeat with the -d autodiscover,docker -e -v flags

Also, the nginx module doesn't have a log fileset; you will need to switch to access:

filebeat.autodiscover:
  providers:
   - type: docker
     templates:
       - condition:
           contains:
             docker.container.name: "nginx"
         config:
           - module: nginx
             access:
               prospector:
                 type: docker
                 container.ids:
                   - "${data.docker.container.id}"

Hi,
Autodiscovery is still not working (with the 'contains' + 'access' changes) - I will have to try and run with debug.
As I wrote before, (without autodiscovery) lines are arriving in Elasticsearch but the message itself (the access log line) is not parsed by Filebeat - it seems like the condition/filter is not working.
Is there a way to write/customize my own module? What is the approach for application logs (other than access logs)? Should I use Logstash as well?

Here is the line as shown in Kibana:
{
  "_index": "filebeat-6.1.1-2018.01.11",
  "_type": "doc",
  "_id": "AWDl2uwphR8b_vpAFHrW",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-01-11T15:33:36.442Z",
    "offset": 685191,
    "stream": "stdout",
    "message": "10.45.104.170 - - [11/Jan/2018:15:33:36 +0000] \"GET / HTTP/1.1\" 200 612 \"-\" \"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0\" \"-\"",
    "source": "/var/lib/docker/containers/039d7f5ed375cb77202718b677dd5d46beb670422b8778608befddfdf6c9bc19/039d7f5ed375cb77202718b677dd5d46beb670422b8778608befddfdf6c9bc19-json.log",
    "prospector": {
      "type": "docker"
    },
    "beat": {
      "name": "e03aca2ac179",
      "hostname": "e03aca2ac179",
      "version": "6.1.1"
    },
    "docker": {
      "container": {
        "id": "039d7f5ed375cb77202718b677dd5d46beb670422b8778608befddfdf6c9bc19",
        "labels": {
          "maintainer": "NGINX Docker Maintainers docker-maint@nginx.com"
        },
        "image": "nginx",
        "name": "nginx"
      }
    }
  },
  "fields": {
    "@timestamp": [
      1515684816442
    ]
  },
  "sort": [
    1515684816442
  ]
}

Hi,

You can write your own modules, or use Logstash or an Elasticsearch ingest node to parse raw lines from Filebeat. Here are some useful links:

https://www.elastic.co/guide/en/beats/devguide/6.1/filebeat-modules-devguide.html

https://www.elastic.co/guide/en/logstash/6.1/advanced-pipeline.html

https://www.elastic.co/guide/en/beats/filebeat/6.1/configuring-ingest-node.html

Best regards

Thanks Carlos, I'm confused...

Is Filebeat actually doing the parsing/changing of lines before sending to Elasticsearch, or does it use the ingest node, so the parsing is done on the Elasticsearch side?
If so, why is there a definition of 'modules' in Filebeat? So I must use both? i.e. put the explicit pipeline in the 'output' section (and which one)...

The documentation is not so clear, and there are not many examples of this particular configuration (nginx/apache2/etc.)

Please elaborate

Hi @Asher_Shoshan,

Filebeat modules package all the different parts of the stack that you would otherwise have to configure and deploy yourself, including: ingest pipelines, template mappings, dashboards and, sometimes, machine learning jobs.

You can choose to use prospectors instead and do all the ingest settings by yourself. You can also write your own module to centralize all the settings.
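For example, a minimal sketch of the prospector-only approach, assuming you have already created an ingest pipeline yourself in Elasticsearch (my-app-pipeline and the log path below are made-up names for illustration):

filebeat.prospectors:
  - type: log
    paths:
      - /var/log/myapp/*.log      # hypothetical application log path
    # Parsing happens in Elasticsearch: this must reference a pipeline you
    # created beforehand, e.g. with PUT _ingest/pipeline/my-app-pipeline
    pipeline: my-app-pipeline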

Ok... (but how?) See the previous correspondence for the difficulties I had...

Since I'm working with Kubernetes and Docker, I'm looking for how to deploy one container of Filebeat which will parse all the Docker logs of all containers on that node. Therefore I need this Filebeat container to act as a distributor, with some 'conditional' capabilities so it knows how to parse each log (by metadata) before sending it to Elasticsearch.
I guess the other option is to install Filebeat as a 'sidecar' on every container and ship the log directly to Elasticsearch, which is not the preferred way.

I wish I had some detailed examples/documentation, or at least knew whether anybody else is using it this way.
Thanks,

See below my filebeat.yml which is working and parses OK (with Elasticsearch ingest); but again - I'm missing how to handle (if/then/else) other logs. For example, nginx directs both the access log and the error log to the console, so they end up in the same JSON file in the container - so what should I write in the YAML? Right now only the access log is handled.

processors:
    - add_kubernetes_metadata:
        in_cluster: true
        namespace: ${POD_NAMESPACE}
    - add_docker_metadata:

filebeat.autodiscover:
  providers:
   - type: docker
     templates:
       - condition:
           contains:
             docker.container.name: "nginx"
         config:
           - module: nginx
             access:
               prospector:
                 type: docker
                 paths:
                 - /var/lib/docker/containers/${data.docker.container.id}/*.log
                 pipeline: filebeat-6.1.1-nginx-access-default
                 container.ids:
                   - "${data.docker.container.id}"

output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']