Nginx Logs Can't Be Parsed Because of Symlinks

Currently I have an nginx container [nginx:latest as base] with Filebeat running in the background.

I have enabled the nginx module with filebeat modules enable nginx, and my filebeat.yml defines an input for the logs with "symlinks: true" enabled.

When I attempt to run it, though, I get an error because the Harvester is unable to read the symlinked files [/var/log/nginx/access.log|error.log].

I have included the setup I am using for the nginx module itself, plus my input. I found where the Harvester does the check, and I am unsure whether it supports reading from symlinked files: beats/harvester.go at main · elastic/beats · GitHub, L555.

My setup:
~/.modules.d/nginx.yml:

- module: nginx
  access:
    enabled: true
    var.paths: ["/var/log/nginx/access.log*"]
  error:
    enabled: true
    var.paths: ["/var/log/nginx/error.log*"]

filebeat.yml:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/*.log
  symlinks: true

ls -lrt /var/log/nginx:

lrwxrwxrwx 1 root root 11 Jan 11 06:31 error.log -> /dev/stderr
lrwxrwxrwx 1 root root 11 Jan 11 06:31 access.log -> /dev/stdout

Error logs:

{"file.name":"log/input.go","file.line":556},"message":"Harvester could not be started on new file: /var/log/nginx/access.log, Err: error setting up harvester: Harvester setup failed. Unexpected file opening error: Tried to open non regular file: \"Dcrw--w----\" access.log","service.name":"filebeat","input_id":"c84db31c-482a-42c6-95b7-be6e57aa822c","source_file":"/var/log/nginx/access.log","state_id":"native::3-175","finished":false,"os_id":"3-175","ecs.version":"1.6.0"}
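For context on the mode string "Dcrw--w----" in that error: the leading "D" and "c" are Go's notation for a (character) device. A minimal Python sketch, using /dev/null as a stand-in character device (an assumption for illustration, not a file from the thread), shows why a stat-based regular-file check rejects these symlink targets, and why "symlinks: true" alone cannot help when the target is not a regular file:

```python
import os
import stat

# In the nginx image, access.log is a symlink to /dev/stdout, which
# resolves to a pipe or character device rather than a regular file.
# os.stat() follows symlinks, so a "regular file" check like the
# Harvester's sees the device's mode, not a log file's.
# /dev/null exhibits the same mode class:
mode = os.stat("/dev/null").st_mode
print(stat.S_ISREG(mode))  # False: not a regular file
print(stat.S_ISCHR(mode))  # True: a character device
```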

TLDR; I am attempting to have filebeat read from /var/log/nginx/error.log|access.log in an nginx container. I am currently getting errors because the files are symlinked. I believe that I have enabled this in the nginx module itself but still get errors from the Harvester.

Hello @Christian_Jacobs

Filebeat does not support reading from non-regular files. A similar question can be found in this thread.

As an option, you can consider running Filebeat as a separate container. You can check this page for running Filebeat in Docker, or this one for running it in Kubernetes.

Thank you for the reply @Tetiana_Kravchenko ,

Unfortunately I am working in a serverless container environment [AWS Fargate].
For the approach you are suggesting, I would create a Filebeat sidecar and a shared volume between the nginx container and Filebeat.
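As a sketch of that idea (the /shared-logs mount point is an assumption, not from the thread): nginx would need to write real files into the shared volume, e.g. by pointing its access_log/error_log directives there instead of keeping the default symlinks to stdout/stderr, and the Filebeat sidecar would then read them with a plain log input:

```yaml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /shared-logs/nginx/*.log   # volume mounted into both containers
  # symlinks: true is no longer needed once nginx writes regular files
```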

In the linked Docker article, the filebeat.docker.yaml file includes the portion below:

filebeat.autodiscover:
  providers:
    - type: docker

I do not believe this will work in Fargate because I do not have access to the underlying Docker service, but I can test.

AWS supports the FireLens log driver for Fargate, using a Fluentd sidecar to gather stdout/stderr from all containers in the same Task.

Do you have a recommended setup for an AWS Fargate environment, or is the approach of creating a volume for all containers to write into, plus a Filebeat sidecar, the recommended one?

@Andrea_Spacca since it is in the AWS domain, could you please have a look at this question?

Thank you for the replies @Tetiana_Kravchenko and @Andrea_Spacca ,

TLDR; Can I use the filebeat nginx module on input coming from TCP? And if so, how do I specify this behavior?

~ Detail
If it is possible to set up the nginx module to parse TCP inputs, then that should fit my use case instead [if what I originally asked is not possible].

I set up a Fluentd container in the Fargate task that forwards all logs to a Filebeat container in the same Task. [I am able to receive logs in the Filebeat container, forwarded from the Fluentd container.]

I now need a way to apply the filebeat nginx module to the logs received over TCP.

Below is an example filebeat.yml

filebeat.inputs:
- type: tcp
  max_message_size: 20MiB
  host: "filebeat:9999"

filebeat.config.modules:
  enabled: true
  path: /etc/filebeat/modules.d/*.yml
output.console:
  pretty: true

(edit:Nginx module is enabled, can provide yml if required.)

You can change the input of the module. Just add input: tcp and whatever additional input settings you want.

@legoguy1000 Thank you for the reply.

Is there any documentation I can follow for what you mentioned?

Is this "input: tcp" attribute put in ~/.modules.d/nginx.yml or another file?
I do not see this attribute when looking at the nginx module itself, so I am assuming you are referring to my filebeat.yml?

I am not trying to be obtuse, I am just struggling to find relevant examples for a serverless setup.

Also, regarding this use case where you mentioned using "input: tcp": I will have multiple types of logs coming in over the same TCP port. I just want to verify that with the approach you suggest I am still able to separate the logs [for instance, separating nginx logs received on a TCP port from database logs sent on the same port].
I have attributes in the message I can use to identify the log source, but I need to be able to set the parser based on a match in the incoming log.

It's kind of a hidden capability, but you can override any module's input by just adding that config. So for nginx it could look like:

- module: nginx
  access:
    enabled: true
    input:
      type: tcp
      host: "0.0.0.0:5151"
......
  error:
    enabled: true
    var.paths: ["/path/to/log/nginx/error.log*"]
As for using the same port for multiple log types or modules, that won't work.
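A possible workaround, sketched under the assumption that each fileset's input settings can be overridden this way (the port numbers and the second module are illustrative only, not from the thread): give each log type its own TCP port, so each module listens on its own input:

```yaml
# One dedicated TCP port per module/fileset, since a single
# port cannot be shared between modules.
- module: nginx
  access:
    enabled: true
    input:
      type: tcp
      host: "0.0.0.0:5151"

- module: postgresql
  log:
    enabled: true
    input:
      type: tcp
      host: "0.0.0.0:5152"
```

The Fluentd forwarder would then route each log type to its matching port instead of sending everything to one.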