Good afternoon!
I have run into a problem with multiline handling in Filebeat. When filebeat.inputs: is set to type: filestream, the log files are not parsed according to multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'. The output contains single-line messages, one per line of the log file, whereas I expected multiline messages in which the related lines are combined into a single event.
If I set type: log in filebeat.inputs: instead, everything works correctly: multiline messages are created according to multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'.
What is wrong in my config?
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - C:\tmp\GT\xml\*\*.log
  fields_under_root: true
  fields:
    system: xml
    subsystem: GT
  multiline.type: pattern
  multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: false
  multiline.match: after
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 1
tags: [xml]
output.file:
  enabled: true
  path: "C:/tmp/output/"
  filename: filebeat
logging:
  to_files: true
  files:
    path: C:/ProgramData/filebeat/Logs
  level: debug
  permissions: 0644
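For comparison, the filestream input documentation configures multiline as an entry in a parsers list rather than through the multiline.* keys that the log input uses. A minimal sketch of the same input written that way is below; it assumes the parsers syntax from the filestream documentation and has not been tested against this setup:

```yaml
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - C:\tmp\GT\xml\*\*.log
  fields_under_root: true
  fields:
    system: xml
    subsystem: GT
  # Assumption: filestream expects multiline settings inside "parsers",
  # not as multiline.* keys directly on the input.
  parsers:
    - multiline:
        type: pattern
        pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'
        negate: false
        match: after
```

The pattern, negate, and match values are copied unchanged from the config above; only their placement differs.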
Log file contents:
Use the filestream input to read lines from active log files. It is the new, improved alternative to the log input. It comes with various improvements to the existing input:
Checking of close_* options happens out of band. Thus, if an output is blocked, Filebeat can close the reader and avoid keeping too many files open.
Detailed metrics are available for all files that match the paths configuration regardless of the harvester_limit. This way, you can keep track of all files, even ones that are not actively read.
The order of parsers is configurable. So it is possible to parse JSON lines and then aggregate the contents into a multiline event.
Some position updates and metadata changes no longer depend on the publishing pipeline. If the pipeline is blocked some changes are still applied to the registry.
Only the most recent updates are serialized to the registry. In contrast, the log input has to serialize the complete registry on each ACK from the outputs. This makes the registry updates much quicker with this input.
The input ensures that only offsets updates are written to the registry append only log. The log writes the complete file state.
Stale entries can be removed from the registry, even if there is no active input.
To configure this input, specify a list of glob-based paths that must be crawled to locate and fetch the log lines.
Example configuration:
This is the output I get:
2021-11-29T09:42:47.980+0300 DEBUG [processors] processing/processors.go:203 Publish event: {
"@timestamp": "2021-11-29T06:42:47.980Z",
"@metadata": {
"beat": "filebeat",
"type": "_doc",
"version": "7.15.2"
},
"subsystem": "GT",
"system": "xml",
"input": {
"type": "filestream"
},
"ecs": {
"version": "1.11.0"
},
"agent": {
"ephemeral_id": "15bb3ca0-a9c1-4740-894c-b986d310b515",
"id": "c19b2dd5-a45e-418b-8062-f5b0547cfab5",
"name": "xmlgate2",
"type": "filebeat",
"version": "7.15.2",
"hostname": "xmlgate2"
},
"log": {
"offset": 175,
"file": {
"path": "C:\\tmp\\GT\\xml\\2021-11-22\\log.log"
}
},
"message": "Use the filestream input to read lines from active log files. It is the new, improved alternative to the log input. It comes with various improvements to the existing input:",
"tags": [
"xml"
],
"host": {
"name": "xmlgate2"
}
}
2021-11-29T09:42:47.980+0300 DEBUG [processors] processing/processors.go:203 Publish event: {
"@timestamp": "2021-11-29T06:42:47.980Z",
"@metadata": {
"beat": "filebeat",
"type": "_doc",
"version": "7.15.2"
},
"log": {
"file": {
"path": "C:\\tmp\\GT\\xml\\2021-11-22\\log.log"
},
"offset": 327
},
"message": "Checking of close_* options happens out of band. Thus, if an output is blocked, Filebeat can close the reader and avoid keeping too many files open.",
"input": {
"type": "filestream"
},
"host": {
"name": "xmlgate2"
},
"agent": {
"name": "xmlgate2",
"type": "filebeat",
"version": "7.15.2",
"hostname": "xmlgate2",
"ephemeral_id": "15bb3ca0-a9c1-4740-894c-b986d310b515",
"id": "c19b2dd5-a45e-418b-8062-f5b0547cfab5"
},
"tags": [
"xml"
],
"subsystem": "GT",
"system": "xml",
"ecs": {
"version": "1.11.0"
}
}
2021-11-29T09:42:47.980+0300 DEBUG [processors] processing/processors.go:203 Publish event: {
"@timestamp": "2021-11-29T06:42:47.980Z",
"@metadata": {
"beat": "filebeat",
"type": "_doc",
"version": "7.15.2"
},
"input": {
"type": "filestream"
},
"subsystem": "GT",
"system": "xml",
"host": {
"name": "xmlgate2"
},
"log": {
"offset": 524,
"file": {
"path": "C:\\tmp\\GT\\xml\\2021-11-22\\log.log"
}
},
"tags": [
"xml"
],
"agent": {
"type": "filebeat",
"version": "7.15.2",
"hostname": "xmlgate2",
"ephemeral_id": "15bb3ca0-a9c1-4740-894c-b986d310b515",
"id": "c19b2dd5-a45e-418b-8062-f5b0547cfab5",
"name": "xmlgate2"
},
"message": "Detailed metrics are available for all files that match the paths configuration regardless of the harvester_limit. This way, you can keep track of all files, even ones that are not actively read.",
"ecs": {
"version": "1.11.0"
}
}
......