Nginx Logs Can't Be Parsed Because of Symlinks

Currently I have an nginx container that has Filebeat running in the background [nginx:latest as base].

I have enabled the nginx module with filebeat modules enable nginx, and my filebeat.yml has an input defined for the logs with "symlinks: true" enabled.

When I attempt to run it, though, I get an error because the Harvester is unable to read the symlinked files [/var/log/nginx/access.log|error.log].

I have included the setup I am using for the nginx module itself and my input below. I found where the Harvester does the check, and I am unsure whether it supports reading from symlinked files: beats/harvester.go at main · elastic/beats · GitHub, L555.

My setup:
~/.modules.d/nginx.yml:

- module: nginx
  access:
    enabled: true
    var.paths: ["/var/log/nginx/access.log*"]
  error:
    enabled: true
    var.paths: ["/var/log/nginx/error.log*"]

filebeat.yml:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/*.log
  symlinks: true

ls -lrt /var/log/nginx:

lrwxrwxrwx 1 root root 11 Jan 11 06:31 error.log -> /dev/stderr
lrwxrwxrwx 1 root root 11 Jan 11 06:31 access.log -> /dev/stdout

Error logs:

{"file.name":"log/input.go","file.line":556},"message":"Harvester could not be started on new file: /var/log/nginx/access.log, Err: error setting up harvester: Harvester setup failed. Unexpected file opening error: Tried to open non regular file: \"Dcrw--w----\" access.log","service.name":"filebeat","input_id":"c84db31c-482a-42c6-95b7-be6e57aa822c","source_file":"/var/log/nginx/access.log","state_id":"native::3-175","finished":false,"os_id":"3-175","ecs.version":"1.6.0"}

TL;DR: I am attempting to have Filebeat read from /var/log/nginx/error.log|access.log in an nginx container. I am currently getting errors because the files are symlinked. I believe that I have enabled symlink support in the nginx module setup, but I still get errors from the Harvester.

Hello @Christian_Jacobs

Filebeat does not support reading from non-regular files; here the symlinks resolve to the character devices /dev/stdout and /dev/stderr. A similar question can be found in this thread.

As an option, you can consider running Filebeat as a separate container. You can check this page for running Filebeat in Docker, or this one for running in Kubernetes.

Thank you for the reply, @Tetiana_Kravchenko.

Unfortunately I am working in a serverless container environment [AWS Fargate].
For the approach that you are suggesting, I would create a Filebeat sidecar and a shared volume between the nginx container and Filebeat.
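
Something like the following is the shape I have in mind. [A minimal sketch in docker-compose syntax purely for illustration; a Fargate task definition expresses the same thing in JSON, and the service names, volume name, and Filebeat version tag here are mine, not from any docs.]

version: "3.8"
services:
  nginx:
    image: nginx:latest
    volumes:
      # nginx.conf would need access_log/error_log pointed at real files here,
      # since the stock image symlinks them to /dev/stdout|/dev/stderr
      - nginx-logs:/var/log/nginx
  filebeat:
    image: docker.elastic.co/beats/filebeat:8.6.1
    volumes:
      - nginx-logs:/var/log/nginx:ro
volumes:
  nginx-logs: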

In the linked Docker article, the filebeat.docker.yaml file includes the portion below:

filebeat.autodiscover:
  providers:
    - type: docker

I do not believe this will work in Fargate because I do not have access to the underlying Docker service, but I can test.

AWS supports the FireLens log driver for Fargate, using a Fluentd sidecar to gather stdout/stderr from all containers in the same Task.
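
For reference, the FireLens wiring in a task definition looks roughly like the below. [CloudFormation YAML as a sketch; the container names, the choice of Fluent Bit over Fluentd, and the forward Host/Port are illustrative, adapted from AWS's FireLens examples rather than anything I have verified in my setup.]

ContainerDefinitions:
  - Name: log_router
    Image: amazon/aws-for-fluent-bit:stable
    FirelensConfiguration:
      Type: fluentbit
  - Name: nginx
    Image: nginx:latest
    LogConfiguration:
      LogDriver: awsfirelens
      Options:
        # forward this container's stdout/stderr to a collector in the same Task
        # (containers in a Fargate task can reach each other over localhost)
        Name: forward
        Host: "localhost"
        Port: "9999"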

Do you have a recommended setup for an AWS Fargate environment, or is creating a shared volume for all containers to write logs into, plus a Filebeat sidecar, the recommended approach?

@Andrea_Spacca since it is in the AWS domain, could you please have a look at this question?

Thank you for the replies, @Tetiana_Kravchenko and @Andrea_Spacca.

TL;DR: Can I use the Filebeat nginx module on input coming in over TCP? And if so, how do I specify this behavior?

~ Detail
If it is possible for me to set up the nginx module to parse TCP inputs, then that should fit my use case instead [if what I originally asked is not possible].

I set up a Fluentd container in the Fargate task that forwards all logs to a Filebeat container in the same Task. [I am able to receive logs in the Filebeat container, forwarded from the Fluentd container.]

I now need a way to use the Filebeat nginx module on the logs received over TCP.

Below is an example filebeat.yml:

filebeat.inputs:
- type: tcp
  max_message_size: 20MiB
  host: "filebeat:9999"

filebeat.config.modules:
  enabled: true
  path: /etc/filebeat/modules.d/*.yml

output.console:
  pretty: true

(edit: Nginx module is enabled, can provide the yml if required.)

You can change the input of the module. Just add input: tcp and whatever additional input settings you want.

@legoguy1000 Thank you for the reply.

Is there any documentation I can follow for what you mentioned?

Is this "input: tcp" attribute put in ~/.modules.d/nginx.yml or another file?
I do not see this attribute when looking at the nginx module itself, so I am assuming you are referring to my filebeat.yml?

I am not trying to be obtuse; I am just struggling to find relevant examples for a serverless setup.

Also, in this use case where you mention using "input: tcp", I will have multiple types of logs coming in over the same TCP port. I just want to verify that with the approach you suggest I am still able to separate the logs out [for instance, separating nginx logs received on the TCP port from database logs also sent on the same port].
I have attributes in the message I can use to identify the log source, but I need to be able to set the parser based on a match in the incoming log.

It's kind of a hidden capability, but you can override any module's input by just adding that config. So for nginx it could look like:

- module: nginx
  access:
    enabled: true
    input:
      type: tcp
      host: "0.0.0.0:5151"
......
  error:
    enabled: true
    var.paths: ["/path/to/log/nginx/error.log*"]

As for using the same port for multiple log types or modules, that won't work.
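
If you need both filesets [or multiple modules] over TCP, each one would need its own port. A sketch of that shape, with illustrative ports:

- module: nginx
  access:
    enabled: true
    input:
      type: tcp
      host: "0.0.0.0:5151"
  error:
    enabled: true
    input:
      type: tcp
      host: "0.0.0.0:5152"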
