Multiline does not work in Filebeat with filebeat.inputs: - type: filestream

Good afternoon!
I ran into a problem with multiline processing in Filebeat. When type: filestream is set under filebeat.inputs:, the log lines are not grouped according to multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'. The output contains single-line messages, one per line of the log file, whereas I expected multiline messages in which the lines are combined into a single event.

If I set type: log under filebeat.inputs: instead, everything works correctly: multiline messages are created according to multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'.

What is wrong in my config?


filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - C:\tmp\GT\xml\*\*.log
  fields_under_root: true
  fields:
    system: xml
    subsystem: GT
  multiline.type: pattern
  multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: false
  multiline.match: after

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

tags: [xml]

output.file:
  enabled: true
  path: "C:/tmp/output/"
  filename: filebeat

logging:
  to_files: true
  files:
    path: C:/ProgramData/filebeat/Logs
  level: debug
  permissions: 0644


Contents of the log file:

Use the filestream input to read lines from active log files. It is the new, improved alternative to the log input. It comes with various improvements to the existing input:

Checking of close_* options happens out of band. Thus, if an output is blocked, Filebeat can close the reader and avoid keeping too many files open.
Detailed metrics are available for all files that match the paths configuration regardless of the harvester_limit. This way, you can keep track of all files, even ones that are not actively read.
The order of parsers is configurable. So it is possible to parse JSON lines and then aggregate the contents into a multiline event.
Some position updates and metadata changes no longer depend on the publishing pipeline. If the pipeline is blocked some changes are still applied to the registry.
Only the most recent updates are serialized to the registry. In contrast, the log input has to serialize the complete registry on each ACK from the outputs. This makes the registry updates much quicker with this input.
The input ensures that only offsets updates are written to the registry append only log. The log writes the complete file state.
Stale entries can be removed from the registry, even if there is no active input.
To configure this input, specify a list of glob-based paths that must be crawled to locate and fetch the log lines.

Example configuration:

This is the output I get:

2021-11-29T09:42:47.980+0300	DEBUG	[processors]	processing/processors.go:203	Publish event: {
  "@timestamp": "2021-11-29T06:42:47.980Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.15.2"
  },
  "subsystem": "GT",
  "system": "xml",
  "input": {
    "type": "filestream"
  },
  "ecs": {
    "version": "1.11.0"
  },
  "agent": {
    "ephemeral_id": "15bb3ca0-a9c1-4740-894c-b986d310b515",
    "id": "c19b2dd5-a45e-418b-8062-f5b0547cfab5",
    "name": "xmlgate2",
    "type": "filebeat",
    "version": "7.15.2",
    "hostname": "xmlgate2"
  },
  "log": {
    "offset": 175,
    "file": {
      "path": "C:\\tmp\\GT\\xml\\2021-11-22\\log.log"
    }
  },
  "message": "Use the filestream input to read lines from active log files. It is the new, improved alternative to the log input. It comes with various improvements to the existing input:",
  "tags": [
    "xml"
  ],
  "host": {
    "name": "xmlgate2"
  }
}
2021-11-29T09:42:47.980+0300	DEBUG	[processors]	processing/processors.go:203	Publish event: {
  "@timestamp": "2021-11-29T06:42:47.980Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.15.2"
  },
  "log": {
    "file": {
      "path": "C:\\tmp\\GT\\xml\\2021-11-22\\log.log"
    },
    "offset": 327
  },
  "message": "Checking of close_* options happens out of band. Thus, if an output is blocked, Filebeat can close the reader and avoid keeping too many files open.",
  "input": {
    "type": "filestream"
  },
  "host": {
    "name": "xmlgate2"
  },
  "agent": {
    "name": "xmlgate2",
    "type": "filebeat",
    "version": "7.15.2",
    "hostname": "xmlgate2",
    "ephemeral_id": "15bb3ca0-a9c1-4740-894c-b986d310b515",
    "id": "c19b2dd5-a45e-418b-8062-f5b0547cfab5"
  },
  "tags": [
    "xml"
  ],
  "subsystem": "GT",
  "system": "xml",
  "ecs": {
    "version": "1.11.0"
  }
}
2021-11-29T09:42:47.980+0300	DEBUG	[processors]	processing/processors.go:203	Publish event: {
  "@timestamp": "2021-11-29T06:42:47.980Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.15.2"
  },
  "input": {
    "type": "filestream"
  },
  "subsystem": "GT",
  "system": "xml",
  "host": {
    "name": "xmlgate2"
  },
  "log": {
    "offset": 524,
    "file": {
      "path": "C:\\tmp\\GT\\xml\\2021-11-22\\log.log"
    }
  },
  "tags": [
    "xml"
  ],
  "agent": {
    "type": "filebeat",
    "version": "7.15.2",
    "hostname": "xmlgate2",
    "ephemeral_id": "15bb3ca0-a9c1-4740-894c-b986d310b515",
    "id": "c19b2dd5-a45e-418b-8062-f5b0547cfab5",
    "name": "xmlgate2"
  },
  "message": "Detailed metrics are available for all files that match the paths configuration regardless of the harvester_limit. This way, you can keep track of all files, even ones that are not actively read.",
  "ecs": {
    "version": "1.11.0"
  }
}
......

I found a solution: with the filestream input, multiline has to be configured through the parsers parameter rather than the top-level multiline.* settings:

  parsers:
    - multiline:
        type: pattern
        pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'
        negate: true
        match: after
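
Put together with the input from my original config, the result would look roughly like the sketch below. It assumes the rest of the input (paths, fields) stays unchanged and that the old multiline.* keys are simply dropped, since judging by the output above they have no effect with filestream. Note that negate is now true, not false as in my first attempt.

```yaml
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - C:\tmp\GT\xml\*\*.log
  fields_under_root: true
  fields:
    system: xml
    subsystem: GT
  # filestream takes multiline settings from the parsers list,
  # not from multiline.* keys at the input level
  parsers:
    - multiline:
        type: pattern
        # a new event starts at a line beginning with "[YYYY-MM-DD"
        pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'
        # negate: true with match: after means that lines which do NOT
        # match the pattern are appended to the previous matching line
        negate: true
        match: after
```

With these settings, every line that starts with a bracketed date opens a new event, and the following lines without such a prefix are attached to it.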
