When I start filebeat with the config file below, I can't stop it using Ctrl + C
config file : debug.yaml
# genaral
max_procs: 1
logging.metrics.enabled: false
# log
logging.to_files: true
logging.files:
path: /data/home/logs
# queue
queue.mem:
events: 8192
flush.min_events: 4096
flush.timeout: 5s
# input
filebeat.inputs:
- close_timeout: "5m"
enabled: true
encoding: "utf-8"
exclude_files:
- "*"
ignore_older: "24h"
ignore_old_file_truncate: false
paths:
- "/data/home/test1.log"
scan_frequency: "10s"
tail_files: true
type: "log"
output.console:
pretty: true
Start command and result.
# ./filebeat7.17.2 -c debug.yaml
^C^C
After using Ctrl + C to stop filebeat, it is still running. So I have to kill it with "kill -9" in another session.
# ps -ef | grep filebeat
root 1786 27940 0 16:10 pts/0 00:00:00 ./filebeat7.17.2 -c debug.yaml
root 4577 30030 0 16:20 pts/1 00:00:00 grep --color=auto filebeat
# kill 1786
# ps -ef | grep filebeat
root 1786 27940 0 16:10 pts/0 00:00:00 ./filebeat7.17.2 -c debug.yaml
root 4705 30030 0 16:21 pts/1 00:00:00 grep --color=auto filebeat
# kill -9 1786
# ps -ef | grep filebeat
root 4750 30030 0 16:21 pts/1 00:00:00 grep --color=auto filebeat
The filebeat log is shown as follows.
2022-09-14T16:10:36.674+0800 INFO instance/beat.go:328 Setup Beat: filebeat; Version: 7.17.2
2022-09-14T16:10:36.674+0800 INFO [publisher] pipeline/module.go:113 Beat name: VM-77-52-centos
2022-09-14T16:10:36.675+0800 WARN beater/filebeat.go:202 Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
2022-09-14T16:10:36.675+0800 INFO instance/beat.go:492 filebeat start running.
2022-09-14T16:10:36.675+0800 INFO memlog/store.go:119 Loading data file of '/data/home/filebeat/data/registry/filebeat' succeeded. Active transaction id=0
2022-09-14T16:10:36.780+0800 INFO memlog/store.go:124 Finished loading transaction log file for '/data/home/filebeat/data/registry/filebeat'. Active transaction id=8492
2022-09-14T16:10:36.780+0800 WARN beater/filebeat.go:411 Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
2022-09-14T16:10:36.780+0800 INFO [registrar] registrar/registrar.go:109 States Loaded from registrar: 4
2022-09-14T16:10:36.780+0800 INFO [crawler] beater/crawler.go:71 Loading Inputs: 1
2022-09-14T16:10:36.781+0800 INFO [crawler] beater/crawler.go:117 starting input, keys present on the config: [filebeat.inputs.0.close_timeout filebeat.inputs.0.enabled filebeat.inputs.0.encoding filebeat.inputs.0.exclude_files.0 filebeat.inputs.0.ignore_old_file_truncate filebeat.inputs.0.ignore_older filebeat.inputs.0.paths.0 filebeat.inputs.0.scan_frequency filebeat.inputs.0.tail_files filebeat.inputs.0.type]
2022-09-14T16:10:36.781+0800 WARN [cfgwarn] log/input.go:89 DEPRECATED: Log input. Use Filestream input instead.
2022-09-14T16:10:36.781+0800 INFO beater/crawler.go:155 Stopping Crawler
2022-09-14T16:10:36.781+0800 INFO beater/crawler.go:165 Stopping 0 inputs
2022-09-14T16:10:36.781+0800 INFO beater/crawler.go:185 Crawler stopped
2022-09-14T16:10:36.781+0800 INFO [registrar] registrar/registrar.go:132 Stopping Registrar
2022-09-14T16:10:36.782+0800 INFO [registrar] registrar/registrar.go:166 Ending Registrar
2022-09-14T16:10:36.782+0800 INFO [registrar] registrar/registrar.go:137 Registrar stopped
When I start filebeat with the config file below, filebeat stopped with error
config file:debug2.yaml
# genaral
max_procs: 1
logging.metrics.enabled: false
# log
logging.to_files: true
logging.files:
path: /data/home/logs
# queue
queue.mem:
events: 8192
flush.min_events: 4096
flush.timeout: 5s
# input
filebeat.inputs:
- close_timeout: "5m"
enabled: true
encoding: "utf-8"
exclude_files: []
ignore_older: "24h"
ignore_old_file_truncate: false
paths:
- "/data/home/test.log"
scan_frequency: "10s"
tail_files: true
type: "log"
- close_timeout: "5m"
enabled: true
encoding: "utf-8"
exclude_files:
- "*"
ignore_older: "24h"
ignore_old_file_truncate: false
paths:
- "/data/home/test1.log"
scan_frequency: "10s"
tail_files: true
type: "log"
output.console:
pretty: true
Start command and result.
# ./filebeat7.17.2 -c debug2.yaml
Exiting: Failed to start crawler: starting input failed: error while initializing input: error parsing regexp: missing argument to repetition operator: `*` accessing 'filebeat.inputs.1.exclude_files.0' (source:'debug2.yaml')
The filebeat log is shown as follows:
2022-09-14T16:28:55.263+0800 INFO instance/beat.go:328 Setup Beat: filebeat; Version: 7.17.2
2022-09-14T16:28:55.263+0800 INFO [publisher] pipeline/module.go:113 Beat name: VM-77-52-centos
2022-09-14T16:28:55.263+0800 WARN beater/filebeat.go:202 Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
2022-09-14T16:28:55.264+0800 INFO instance/beat.go:492 filebeat start running.
2022-09-14T16:28:55.264+0800 INFO memlog/store.go:119 Loading data file of '/data/home/filebeat/data/registry/filebeat' succeeded. Active transaction id=0
2022-09-14T16:28:55.370+0800 INFO memlog/store.go:124 Finished loading transaction log file for '/data/home/filebeat/data/registry/filebeat'. Active transaction id=8496
2022-09-14T16:28:55.370+0800 WARN beater/filebeat.go:411 Filebeat is unable to load the ingest pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the ingest pipelines or are using Logstash pipelines, you can ignore this warning.
2022-09-14T16:28:55.370+0800 INFO [registrar] registrar/registrar.go:109 States Loaded from registrar: 4
2022-09-14T16:28:55.370+0800 INFO [crawler] beater/crawler.go:71 Loading Inputs: 2
2022-09-14T16:28:55.371+0800 INFO [crawler] beater/crawler.go:117 starting input, keys present on the config: [filebeat.inputs.0.close_timeout filebeat.inputs.0.enabled filebeat.inputs.0.encoding filebeat.inputs.0.ignore_old_file_truncate filebeat.inputs.0.ignore_older filebeat.inputs.0.paths.0 filebeat.inputs.0.scan_frequency filebeat.inputs.0.tail_files filebeat.inputs.0.type]
2022-09-14T16:28:55.371+0800 WARN [cfgwarn] log/input.go:89 DEPRECATED: Log input. Use Filestream input instead.
2022-09-14T16:28:55.371+0800 INFO [input] log/input.go:171 Configured paths: [/data/home/test.log] {"input_id": "7d57477b-15e0-4967-941e-daaafec3ff2f"}
2022-09-14T16:28:55.371+0800 INFO [crawler] beater/crawler.go:148 Starting input (ID: 4560358641352877852)
2022-09-14T16:28:55.371+0800 INFO [crawler] beater/crawler.go:117 starting input, keys present on the config: [filebeat.inputs.1.close_timeout filebeat.inputs.1.enabled filebeat.inputs.1.encoding filebeat.inputs.1.exclude_files.0 filebeat.inputs.1.ignore_old_file_truncate filebeat.inputs.1.ignore_older filebeat.inputs.1.paths.0 filebeat.inputs.1.scan_frequency filebeat.inputs.1.tail_files filebeat.inputs.1.type]
2022-09-14T16:28:55.371+0800 INFO beater/crawler.go:155 Stopping Crawler
2022-09-14T16:28:55.371+0800 INFO beater/crawler.go:165 Stopping 1 inputs
2022-09-14T16:28:55.373+0800 INFO [crawler] beater/crawler.go:170 Stopping input: 4560358641352877852
2022-09-14T16:28:55.373+0800 INFO input/input.go:136 input ticker stopped
2022-09-14T16:28:55.373+0800 INFO beater/crawler.go:185 Crawler stopped
2022-09-14T16:28:55.373+0800 INFO [registrar] registrar/registrar.go:132 Stopping Registrar
2022-09-14T16:28:55.373+0800 INFO [registrar] registrar/registrar.go:166 Ending Registrar
2022-09-14T16:28:55.373+0800 INFO [registrar] registrar/registrar.go:137 Registrar stopped
2022-09-14T16:28:55.376+0800 INFO instance/beat.go:497 filebeat stopped.
After some digging it seems to hang on https://github.com/elastic/beats/blob/v7.17.2/filebeat/beater/filebeat.go:376 while it waits for a reference to the registry to be released, causing it to wait forever.
A detailed description will be given below.
When the first input in the config file is configured with exclude_files, and exclude_files is configured with an incorrect regular expression, such as "*". The error will be returned at filebeat.go:468 and the crawler will be stopped after that.
After the crawler stopped, we focus on filebeat.go:422, the taskgroup is stopped which means that no more goroutine can be started.
Taskgroup is used in filebeat.go:423 -> loader.go:68 -> manager.go:135.
As I mentioned above, the inability to release the reference to the registry caused the filebeat process to hang. The logic of releasing registry reference is in cim.shutdown()(manager.go:136). If the Go method(manager.go:135) successful start the task, cim.shutdown() can be finally executed and filebeat process can exit with an error.
However, the hung process means that some exception occurred when calling the Go method of the task group, causing the task not to execute.
So let's turn to the detail of the Go method (taskgroup.go:152)
func (t *TaskGroup) Go(fn func(context.Context) error) error {
t.init(context.Background())
if err := t.wg.Add(1); err != nil {
return err
}
go func() {
defer t.wg.Done()
for t.closer.Err() == nil {
err := fn(t.closer)
action, err := t.OnQuit(err)
if err != nil && err != context.Canceled {
t.mu.Lock()
t.errs = append(t.errs, err)
if t.MaxErrors > 0 && len(t.errs) > t.MaxErrors {
t.errs = t.errs[1:]
}
t.mu.Unlock()
}
switch action {
case TaskGroupStopActionContinue:
return // finish managed go-routine, but keep other routines alive.
case TaskGroupStopActionShutdown:
t.signalStop()
return
case TaskGroupStopActionRestart:
// continue with loop
}
}
}()
return nil
}
The task "fn" is only executed when "t.closer.Err() == nil". However, after the taskgroup stopped, t.closer.Err() is "context canceled" and the task can not be executed. It means that cim.shutdown()(manager.go:136) will never be called to release the reference of the registry.
When the first input causes the crawler to stop, the task group stops too quickly to start the goroutine that will eventually release the registry's reference.
Hence, the filebeat process only can be forced to stop with "kill -9".
The reference code file can be found as belows:
filebeat.go: beats/filebeat.go at v7.17.2 · elastic/beats · GitHub
loader.go: beats/loader.go at v7.17.2 · elastic/beats · GitHub
manager.go: beats/manager.go at v7.17.2 · elastic/beats · GitHub
taskgroup.go: go-concert/taskgroup.go at v0.2.0 · elastic/go-concert · GitHub