Hello,
We are having some difficulty configuring Filebeat to forward a large number of log files to our system.
Since we are deploying on another tenant's machine, we are required to keep resource consumption at a reasonable level. However, with this many files it has become difficult to create a configuration that meets that requirement while still keeping up with file updates.
Filebeat (and our Elastic Stack) version: 6.2.2
The machine running Filebeat:
- RHEL 5.9
- 2 CPUs
- 16 GB RAM
The log files:
Data Type 1:
- Number of files: 323+
- File creation: 1/day
- Update rate: 1 row/minute
- Total size per file: 80 KB
Data Type 2:
- Number of files: 323+
- File creation: 1/day
- Update rate: 1 row/minute
- Total size per file: 55 KB
Data Type 3:
- Number of files: 323+
- File creation: 1/day
- Update rate: ~6 rows/10 minutes
- Total size per file: 2 KB
- Note: This file has strange logging behavior on update: the file is deleted and then recreated with the new records included. Optionally, to avoid this mess, we could instead consume the historical logs, which are rolled over at midnight and not updated after that.
Data Type 4:
- Number of files: 5
- File creation: 1/day
- Update rate: Entire file dumped at midnight
- Total size per file: 30 KB, 225 KB, 565 KB, 15 KB, 177 KB
All logs are single-line (one record per line).
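For Data Type 3, we are wondering whether the close_removed and clean_removed options are the right way to cope with the delete-and-recreate behavior. A minimal sketch of what we have in mind (option names taken from the Filebeat 6.2 docs; the path placeholder for the live files is hypothetical, since our template currently only defines the historical pattern):

```yaml
### Data Type 3 (live files, which are deleted and recreated on update)
- type: log
  paths:
    - {{dtype3.file_pattern}}   # hypothetical placeholder for the live files
  close_removed: true    # close the harvester when the file is deleted (the default)
  clean_removed: true    # drop registry state for deleted files (the default)
  close_inactive: 1m
```

We are unsure whether this avoids re-reading the whole file after each recreate, or whether consuming only the historical rollovers is the safer choice.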
The initial configuration file:
filebeat.prospectors:
- ### Data Type 1
  type: log
  encoding: plain
  # enabled: false
  paths:
    - {{dtype1.file_pattern}}
  exclude_files: {{dtype1.exclude_pattern}}
  fields:
    topic: {{dtype1.topic}}
  exclude_lines: [ '^#' ]
  ignore_older: 168h
  harvester_limit: 124
  close_inactive: 1m
- ### Data Type 2
  type: log
  encoding: plain
  # enabled: false
  paths:
    - {{dtype2.file_pattern}}
  fields:
    topic: {{dtype2.topic}}
  exclude_lines: [ '^#' ]
  ignore_older: 168h
  harvester_limit: 124
  close_inactive: 1m
- ### Data Type 3
  type: log
  encoding: plain
  # enabled: false
  paths:
    - {{dtype3.historical.file_pattern}}
  fields:
    topic: {{dtype3.topic}}
  ignore_older: 168h
  harvester_limit: 124
  close_inactive: 1m
- ### Data Type 4, File 1
  type: log
  encoding: plain
  # enabled: false
  paths:
    - {{dtype4.file1.file_pattern}}
  fields:
    topic: {{dtype4.file1.topic}}
  ignore_older: 168h
  harvester_limit: 124
  close_inactive: 1m
- ### Data Type 4, File 2
  type: log
  encoding: plain
  # enabled: false
  paths:
    - {{dtype4.file2.file_pattern}}
  fields:
    topic: {{dtype4.file2.topic}}
  ignore_older: 168h
  harvester_limit: 124
  close_inactive: 1m
- ### Data Type 4, File 3
  type: log
  encoding: plain
  # enabled: false
  paths:
    - {{dtype4.file3.file_pattern}}
  exclude_files: {{dtype4.file3.exclude_pattern}}
  fields:
    topic: {{dtype4.file3.topic}}
  ignore_older: 168h
  harvester_limit: 124
  close_inactive: 1m
- ### Data Type 4, File 4
  type: log
  encoding: plain
  # enabled: false
  paths:
    - {{dtype4.file4.file_pattern}}
  fields:
    topic: {{dtype4.file4.topic}}
  ignore_older: 168h
  harvester_limit: 124
  close_inactive: 1m
- ### Data Type 4, File 5
  type: log
  encoding: plain
  # enabled: false
  paths:
    - {{dtype4.file5.file_pattern}}
  fields:
    topic: {{dtype4.file5.topic}}
  ignore_older: 168h
  harvester_limit: 124
  close_inactive: 1m
output:
  kafka:
    # enabled: false
    hosts: "${KAF_SERVER_ALL}"
    topic: '%{[fields][topic]}'
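To keep resource usage down, these are the knobs we are considering tightening on each prospector (a sketch only, using option names from the Filebeat 6.2 docs; the values are untested guesses, not measurements):

```yaml
filebeat.prospectors:
- type: log
  paths:
    - {{dtype1.file_pattern}}
  scan_frequency: 30s    # poll for new files/lines less often (default 10s)
  close_inactive: 1m     # release file handles quickly between sparse updates
  ignore_older: 168h
  clean_inactive: 169h   # per the docs, must be > ignore_older + scan_frequency
  harvester_limit: 124

# cap the in-memory event queue to bound RAM usage (default is 4096 events)
queue.mem:
  events: 2048
```

Is this the right set of options to tune for hundreds of slowly-updating files, or are we missing something more effective?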
Any advice on this use case would be helpful.