The general guideline for deploying Filebeat in Docker/Kubernetes is to run one instance (container) of Filebeat on each Kubernetes node, and to harvest the logs located in /var/lib/docker/containers/*/*.log.
However, I can't find a way to define a prospector/processor which will parse those logs and do additional parsing of the actual application log (Apache access logs, for example) - something like running the built-in modules on this field.
The only thing I found in the documentation is how to enrich the log with Kubernetes info (pod name, container, etc.).
Has anyone done such a thing?
Hi @Asher_Shoshan,
Have a look at Autodiscover settings: https://www.elastic.co/guide/en/beats/filebeat/6.1/configuration-autodiscover.html.
You can statically map images to the settings you want to use. You can also use labels to dynamically decide how to fetch logs from a container.
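For example, here is a minimal sketch of a label-based template (the logformat label name is purely hypothetical; the config block mirrors the image-based examples further down in this thread):

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              # match on a container label instead of the image name
              docker.container.labels.logformat: "apache"
          config:
            - module: apache2
              access:
                prospector:
                  type: docker
                  container.ids:
                    - "${data.docker.container.id}"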
This feature was released with 6.1, and as always, feedback is really appreciated
Best regards
Thanks,
So should I use autodiscover instead of:
processors:
  - add_kubernetes_metadata:
      in_cluster: true
      namespace: ${POD_NAMESPACE}
Still, what do I do with autodiscover in order to parse the log and see the fields in Kibana (for example all the fields of the access log - HTTP request, response code, etc.), and not just one JSON field 'log'?
Tried 'autodiscover' with Docker. Getting "unable to connect to docker 'unix:///docker....'":
2018/01/03 11:01:22.421490 beat.go:635: CRIT Exiting: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
You may need to check that Filebeat has permission to access the socket file, or run it as root
Yes... Kubernetes as well
Can you elaborate? How to run it as root?
Sure, for kubernetes you have a complete example here: https://raw.githubusercontent.com/elastic/beats/6.1/deploy/kubernetes/filebeat-kubernetes.yaml
The trick comes from:
securityContext:
  runAsUser: 0
For Docker, just use docker run -u root. You may want to pass --stirct.perms=false to Filebeat, to avoid errors due to config file ownership.
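For instance, a minimal docker-compose sketch along these lines (the image tag, config path and extra mounts are assumptions on my side, not something required by this thread):

version: '2'
services:
  filebeat:
    image: docker.elastic.co/beats/filebeat:6.1.1
    user: root                                  # same effect as docker run -u root
    command: ['-e', '--strict.perms=false']     # relax ownership checks on the mounted config
    volumes:
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock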
Thanks, but again... how to solve the original issue?
i.e. one Filebeat container on a Kube node harvests all containers' logs in /var/lib/docker/containers/*/*.log.
Then all the app logs (apache, redis, etc.) arrive in Elasticsearch unparsed (as one field), therefore I can not get meaningful results in Kibana.
Is there a way to do the parsing inside the processor, or to reapply the built-in modules?
(I haven't seen a way to define my own parsing.)
typo: --stirct.perms=false should be --strict.perms=false?
+1 to Asher's: "how to solve the original issue?"
Once you have Filebeat running you can use autodiscover (https://www.elastic.co/guide/en/beats/filebeat/6.1/configuration-autodiscover.html) by adding this to your filebeat.yml:
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.image: "httpd"
          config:
            - module: apache2
              access:
                prospector:
                  type: docker
                  container.ids:
                    - "${data.docker.container.id}"
This detects apache instances (httpd image) and launches the Filebeat apache2 module to parse its logs (from the container).
Also, in order to get access to Docker you should mount the Docker socket into the Filebeat container. Add these to your Filebeat DaemonSet spec:
To volumeMounts:
  - name: dockersock
    mountPath: /var/run/docker.sock
To volumes:
  - name: dockersock
    hostPath:
      path: /var/run/docker.sock
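Putting it together, a trimmed-down sketch of the relevant DaemonSet pieces (based on the example manifest linked above; the names and the varlibdockercontainers mount are illustrative):

spec:
  template:
    spec:
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:6.1.1
          securityContext:
            runAsUser: 0                  # run as root so Filebeat can read the socket and log files
          volumeMounts:
            - name: dockersock
              mountPath: /var/run/docker.sock
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers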
Hi,
Added the docker.sock, and Filebeat started with no errors - however nothing is happening.
i.e. autodiscover is not triggering a new prospector when I launch a new container, and nothing arrives in Elasticsearch.
My filebeat.yml:
filebeat.modules:
  - module: nginx
  - module: apache2
  - module: system
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            equals:
              docker.container.name: "nginx"
          config:
            - module: nginx
              log:
                prospector:
                  type: docker
                  container.ids:
                    - "${data.docker.container.id}"
output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:10.100.66.194}:${ELASTICSEARCH_PORT:9200}']
output.console:
  pretty: true
  enabled: false
Without autodiscover, I used the yml file below, and again the "message" part of the JSON line (in the container logs) is not parsed by the module - so I can not really use it properly in Elasticsearch/Kibana.
filebeat.modules:
  - module: nginx
filebeat.prospectors:
  - type: docker
    paths:
      - /var/lib/docker/containers/*/*.log
    templates:
      - condition:
          equals:
            docker.container.name: "nginx"
        config:
          - module: nginx
            log:
              prospector:
                type: docker
                container.ids:
                  - "${data.docker.container.id}"
processors:
  - add_docker_metadata:
output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:10.100.66.194}:${ELASTICSEARCH_PORT:9200}']
output.console:
  pretty: true
  enabled: false
You may want to change the condition from equals to contains, as docker.container.name includes the full path of the image + the version.
Give it a try. If that doesn't work, you can review autodiscover messages by running Filebeat with the -d autodiscover,docker -e -v flags.
Also, the nginx module doesn't have a log fileset; you will need to switch to access:
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.name: "nginx"
          config:
            - module: nginx
              access:
                prospector:
                  type: docker
                  container.ids:
                    - "${data.docker.container.id}"
Hi,
Autodiscover still isn't working (with the 'contains' + 'access' changes) - I'll have to try and run it with debug.
As I wrote before, without autodiscover the lines do arrive in Elasticsearch but the message itself (the access log line) is not parsed by Filebeat. It seems like the condition/filter is not working.
Is there a way to write/customize my own module? What is the approach for application logs (other than access log)? Should I use Logstash as well?
Here is the line as shown in Kibana:
{
  "_index": "filebeat-6.1.1-2018.01.11",
  "_type": "doc",
  "_id": "AWDl2uwphR8b_vpAFHrW",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-01-11T15:33:36.442Z",
    "offset": 685191,
    "stream": "stdout",
    "message": "10.45.104.170 - - [11/Jan/2018:15:33:36 +0000] \"GET / HTTP/1.1\" 200 612 \"-\" \"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0\" \"-\"",
    "source": "/var/lib/docker/containers/039d7f5ed375cb77202718b677dd5d46beb670422b8778608befddfdf6c9bc19/039d7f5ed375cb77202718b677dd5d46beb670422b8778608befddfdf6c9bc19-json.log",
    "prospector": {
      "type": "docker"
    },
    "beat": {
      "name": "e03aca2ac179",
      "hostname": "e03aca2ac179",
      "version": "6.1.1"
    },
    "docker": {
      "container": {
        "id": "039d7f5ed375cb77202718b677dd5d46beb670422b8778608befddfdf6c9bc19",
        "labels": {
          "maintainer": "NGINX Docker Maintainers docker-maint@nginx.com"
        },
        "image": "nginx",
        "name": "nginx"
      }
    }
  },
  "fields": {
    "@timestamp": [
      1515684816442
    ]
  },
  "sort": [
    1515684816442
  ]
}
Hi,
You can write your own modules, or use Logstash/Ingest node to parse raw lines from Filebeat. Here you have some useful links:
https://www.elastic.co/guide/en/beats/devguide/6.1/filebeat-modules-devguide.html
https://www.elastic.co/guide/en/logstash/6.1/advanced-pipeline.html
https://www.elastic.co/guide/en/beats/filebeat/6.1/configuring-ingest-node.html
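For the ingest node route, a rough sketch of how it could be wired up with autodiscover (this mirrors the template structure used elsewhere in this thread; the container name "myapp" and the pipeline name "my-app-pipeline" are hypothetical, and the pipeline has to be created in Elasticsearch first):

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.name: "myapp"
          config:
            - type: docker
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log
              container.ids:
                - "${data.docker.container.id}"
              # route the raw lines through your own ingest pipeline instead of a module
              pipeline: my-app-pipeline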
Best regards
Thanks Carlos, I'm confused...
Is Filebeat actually doing the parsing/changing of lines before sending them to Elasticsearch, or does it use the ingest node, so the parsing is done on the Elasticsearch side?
If so, why is there a definition of 'modules' in Filebeat? So must I use both? i.e. put an explicit pipeline in the 'output' section (and which one)?
The documentation is not so clear, and there are not many examples with this particular configuration (nginx/apache2/etc.).
Please elaborate
Hi @Asher_Shoshan,
Filebeat modules package all the different parts of the stack you would otherwise have to configure, and deploy them for you, including: the ingest pipeline, template mappings, dashboards and sometimes machine learning jobs.
You can choose to use prospectors instead and do all the ingest settings by yourself. You can also write your own module to centralize all the settings.
Ok... (but how?) See my previous correspondence with the difficulties I had...
Since I'm working with Kubernetes and Docker, I'm looking for how to deploy one container of Filebeat which will parse all the Docker logs of all containers on that node. Therefore I need this Filebeat container to act as a distributor, with some 'conditional' capabilities so it knows how to parse each log (by metadata) before sending it to Elasticsearch.
I guess the other option is to install Filebeat as a 'sidecar' on every container and ship the log directly to Elasticsearch, which is not the preferred way.
I wish I had some detailed examples/documentation, or at least knew whether anybody else is using it this way.
Thanks,
See below the filebeat.yml which is working and parsing ok (with the Elasticsearch ingest node); but again - I'm missing the 'howto' for handling (if/then/else) other logs. For example, nginx directs both the access log and the error log to the console, so they end up in the same JSON file in the container - what should I write in the yaml for that? Right now it only handles the access log.
processors:
  - add_kubernetes_metadata:
      in_cluster: true
      namespace: ${POD_NAMESPACE}
  - add_docker_metadata:
filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.name: "nginx"
          config:
            - module: nginx
              access:
                prospector:
                  type: docker
                  paths:
                    - /var/lib/docker/containers/${data.docker.container.id}/*.log
                  pipeline: filebeat-6.1.1-nginx-access-default
                  container.ids:
                    - "${data.docker.container.id}"
output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
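One possible extension for the error log part of the question (an untested sketch, not something confirmed in this thread) would be to enable the module's error fileset next to access in the same template. Note that both filesets would then read the same container log file, so lines of one type will not match the other fileset's pipeline:

            - module: nginx
              access:
                prospector:
                  type: docker
                  paths:
                    - /var/lib/docker/containers/${data.docker.container.id}/*.log
                  pipeline: filebeat-6.1.1-nginx-access-default
                  container.ids:
                    - "${data.docker.container.id}"
              error:
                prospector:
                  type: docker
                  paths:
                    - /var/lib/docker/containers/${data.docker.container.id}/*.log
                  container.ids:
                    - "${data.docker.container.id}"
                  # the module normally attaches its own error pipeline; add an explicit
                  # pipeline: line like above only if it turns out to be needed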