The code you initially posted is a Logstash pipeline configuration, not a Filebeat or Elasticsearch one.
Usually the data flow is one of the following:
- filebeat -> elasticsearch
- filebeat -> logstash -> elasticsearch
- logstash -> elasticsearch
Filebeat (Beats) is a "standalone" binary deployed on the host where you want to collect data, while Logstash can indeed collect logs too but requires a JVM to run.
Once collected, the events are sent to Elasticsearch and, if configured, an ingest pipeline is executed on each event during ingestion, resulting in a new document in your target index.
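For example, here is a minimal sketch of how Filebeat's Elasticsearch output can reference such a pipeline (the host and pipeline name below are placeholders, not taken from your setup):

```yaml
# Sketch only: have Filebeat's Elasticsearch output run an ingest pipeline
# on every event. "my-ingest-pipeline" is a hypothetical name and the
# pipeline must already exist in Elasticsearch.
output.elasticsearch:
  hosts: ["http://localhost:9200"]  # assumed local Elasticsearch endpoint
  pipeline: "my-ingest-pipeline"
```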
So in your case, since you mentioned Filebeat, I assume you plan to use Filebeat to read the logs, then send them either to Logstash or directly to Elasticsearch.
To process XML-formatted log data with Filebeat, you can indeed use a multiline parser to capture your complete XML entry as the message, then parse that "message" field into structured fields with either Filebeat's decode_xml processor or Logstash's xml filter. For example:
```yaml
filebeat.inputs:
  - type: filestream
    id: my-filestream-id
    paths:
      - /opt/path/to/my/xml.log
    parsers:
      - multiline:
          type: pattern
          pattern: '^<record>'
          negate: true
          match: after

# The XML conversion can happen in the same filebeat.yml or be handed over to Logstash.
# If done in Filebeat:
processors:
  - decode_xml:
      field: message
      target_field: "record"
      overwrite_keys: true

# ... followed by any output (Logstash is not required here)
```
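If you would rather hand the XML parsing over to Logstash instead of using decode_xml, a minimal pipeline sketch could look like this (the beats port, Elasticsearch host, and index name are assumptions to adjust to your environment):

```
# In this variant Logstash does the XML parsing instead of Filebeat:
# drop the decode_xml processor from filebeat.yml and point its output at Logstash.
input {
  beats {
    port => 5044  # must match the output.logstash section of filebeat.yml
  }
}

filter {
  xml {
    source => "message"   # the multiline XML entry shipped by Filebeat
    target => "record"    # same target field as the decode_xml example above
    force_array => false  # store single child elements as values, not arrays
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "xml-logs"   # hypothetical index name
  }
}
```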
Here is a similar thread about it
PS: This is my first post on the platform, so I'm not sure if these details are sufficient.