Hello, I’m trying to collect logs from a certain folder using Filebeat 8.18.3. All the logs live in subfolders of this folder. However, alongside the log subfolders there are two folders containing a few million XML files that are not logs (not my decision, I have no idea why they’re there). When I run Filebeat, it uses more and more RAM and eventually crashes.
Here is the relevant part of the config:
```
filebeat.inputs:
  # filestream is an input for collecting log messages from files.
  - type: filestream
    # Unique ID among all inputs; an ID is required.
    id: "some-id"
    # Change to true to enable this input configuration.
    enabled: true
    # Paths that should be crawled and fetched. Glob based paths.
    paths:
      - /some/path/logs/**/*.txt
    prospector.scanner.exclude_files: [
      '^\/some\/path\/logs\/swap_files\/.+',
      '^\/some\/path\/logs\/export_files\/.+'
    ]
    pipeline: "some-pipeline"
    parsers:
      - multiline:
          pattern: '^\[\d{2}:\d{2}:\d{2}\.\d{3} [A-Z]{3}\]'
          negate: true
          match: after
```
I also tried different `exclude_files` patterns:
```
prospector.scanner.exclude_files: [
  '^\/some\/path\/logs\/swap_files\/',
  '^\/some\/path\/logs\/export_files\/'
]
```
```
prospector.scanner.exclude_files: [
  '^/some/path/logs/swap_files/',
  '^/some/path/logs/export_files/'
]
```
However, none of these seem to work. From what I understand, the scanner still matches the regex against every file name, so it doesn’t solve my memory problem. For now I just specify a separate path for every subfolder except those two, but that doesn’t seem like a scalable solution.
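For reference, the workaround I’m using now looks roughly like this (the subfolder names below are placeholders, not the real ones):

```
paths:
  - /some/path/logs/subfolder_a/*.txt
  - /some/path/logs/subfolder_b/*.txt
  # ...one entry per log subfolder, skipping swap_files and export_files
```

It works, but the config has to be updated every time a new subfolder appears, which is why it doesn’t scale.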