Hi,
I'm trying Filebeat 5.0 with Logstash 2.4. Here is my configuration.
filebeat.prospectors:
- input_type: log
  paths:
    - /home/spark/*.log
  ignore_older: 24h
  clean_inactive: 25h
  multiline.pattern: '^\[0-9]'
  multiline.negate: true
  multiline.match: after
  multiline.max_lines: 1000000
  multiline.timeout: 5m

output.logstash:
  hosts: ["localhost:5044"]

logging.files:
  keepfiles: 100
I found that it works fine for newly created files, but Filebeat can't handle an old file that was created two days ago. That file is about 30 MB and 300,000 lines, and it is still being written to. Here is the beginning of the Filebeat log; it keeps repeating the same lines:
2016-11-23T08:59:27Z DBG Check file for harvesting: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr
2016-11-23T08:59:27Z DBG Start harvester for new file: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr
2016-11-23T08:59:27Z DBG Setting offset for file based on seek: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr
2016-11-23T08:59:27Z DBG Setting offset for file: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr. Offset: 0
2016-11-23T08:59:27Z DBG New state added for /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr
2016-11-23T08:59:27Z DBG Prospector states cleaned up. Before: 1, After: 1
2016-11-23T08:59:27Z INFO Harvester started for file: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr
2016-11-23T08:59:28Z DBG End of file reached: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr; Backoff now.
2016-11-23T08:59:29Z DBG End of file reached: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr; Backoff now.
2016-11-23T08:59:31Z DBG End of file reached: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr; Backoff now.
2016-11-23T08:59:32Z DBG Flushing spooler because of timeout. Events flushed: 1
2016-11-23T08:59:32Z DBG No events to publish
2016-11-23T08:59:32Z DBG Events sent: 1
2016-11-23T08:59:32Z DBG Processing 1 events
2016-11-23T08:59:32Z DBG New state added for /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr
2016-11-23T08:59:32Z DBG Registrar states cleaned up. Before: 1 , After: 1
2016-11-23T08:59:32Z DBG Write registry file: /var/lib/filebeat/registry
2016-11-23T08:59:32Z DBG Registry file updated. 1 states written.
2016-11-23T08:59:35Z DBG End of file reached: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr; Backoff now.
2016-11-23T08:59:37Z DBG Flushing spooler because of timeout. Events flushed: 0
2016-11-23T08:59:37Z DBG Run prospector
2016-11-23T08:59:37Z DBG Start next scan
2016-11-23T08:59:37Z DBG Check file for harvesting: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr
2016-11-23T08:59:37Z DBG Update existing file for harvesting: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr, offset: 0
2016-11-23T08:59:37Z DBG Harvester for file is still running: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr
2016-11-23T08:59:37Z DBG Prospector states cleaned up. Before: 1, After: 1
2016-11-23T08:59:42Z DBG Flushing spooler because of timeout. Events flushed: 0
2016-11-23T08:59:43Z DBG End of file reached: /mnt/resource/hadoop/yarn/log/application_1479342877823_1077/container_e51_1479342877823_1077_01_000002/stderr; Backoff now.
2016-11-23T08:59:47Z DBG Run prospector
The offset in the registry file is always 0. I know it's larger than max_bytes, which is 10 MiB, but under this condition shouldn't Filebeat be able to drop that oversized event and keep handling the new lines? How should I handle this situation with Filebeat?
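For what it's worth, this is the kind of tuning I was considering for the prospector, based on my (possibly wrong) understanding that max_bytes and multiline.max_lines cap the size of a single multiline event; the values below are just guesses, not a verified fix:

filebeat.prospectors:
- input_type: log
  paths:
    - /home/spark/*.log
  ignore_older: 24h
  clean_inactive: 25h
  # as I understand it, bytes beyond this limit are discarded from a single event
  max_bytes: 10485760
  multiline.pattern: '^\[0-9]'
  multiline.negate: true
  multiline.match: after
  # as I understand it, lines beyond this count are dropped instead of merged
  multiline.max_lines: 500
  multiline.timeout: 5s

Would lowering multiline.max_lines like this make Filebeat truncate the oversized event and move on, or would it still stay at offset 0?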
If the timeouts need to be different for different log files, you can use multiple prospectors, each with its own timeout, as in the sketch below.
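For example, something along these lines (the paths and timeout values are only placeholders, not a tested configuration):

filebeat.prospectors:
# prospector for logs that are written frequently, with a short multiline timeout
- input_type: log
  paths:
    - /home/spark/app-*.log
  multiline.pattern: '^\[0-9]'
  multiline.negate: true
  multiline.match: after
  multiline.timeout: 5s
# prospector for logs that are flushed less often, with a longer multiline timeout
- input_type: log
  paths:
    - /home/spark/batch-*.log
  multiline.pattern: '^\[0-9]'
  multiline.negate: true
  multiline.match: after
  multiline.timeout: 5m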