I know there are a lot of similar topics here, because I was trying to figure out from them how to parse logs from XML into Elastic. What I'm trying to do is: each record should end up as one log message, with the XML elements broken out into separate fields, for example fieldxml.date and fieldxml.millis. I tried to figure it out with ChatGPT, but even that couldn't help me. I tried something like this:
The code you initially posted would be for a Logstash pipeline, not Filebeat or Elasticsearch.
Usually the data flow is one of the following:
filebeat -> elasticsearch
filebeat -> logstash -> elasticsearch
logstash -> elasticsearch
Filebeat (Beats) is a "standalone" binary deployed on the host where you want to collect data, while Logstash can indeed collect logs too but requires a JVM to run.
Once collected, the events are sent to Elasticsearch and, if requested, an ingest pipeline is executed on each event during ingestion, resulting in a new document in your target index.
So in your case, since you mentioned Filebeat, I assume you plan to use Filebeat to collect the logs and then send them to Logstash or directly to Elasticsearch.
Then, to process XML-formatted log data with Filebeat, you can use a multiline parser to extract your complete XML entry as the message, and afterwards parse that "message" field into actual structured fields with either Filebeat's decode_xml processor or Logstash's xml filter.
filebeat.inputs:
  - type: filestream
    id: my-filestream-id
    paths:
      - /opt/path/to/my/xml.log
    parsers:
      - multiline:
          type: pattern
          pattern: '^<record>'
          negate: true
          match: after

# xml conversion can be within the same filebeat.yaml or handed over to logstash.
# if in the same:
processors:
  - decode_xml:
      field: message
      target_field: "record"
      overwrite_keys: true

# any output (here logstash is not necessary)
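If you would rather hand the parsing over to Logstash instead, a minimal pipeline using the xml filter could look roughly like this. This is only a sketch: the beats port, the "record" target field name, and the Elasticsearch host are assumptions you would adapt to your setup.

```
input {
  # receive events from Filebeat (port 5044 is the conventional default, adjust as needed)
  beats {
    port => 5044
  }
}

filter {
  # parse the multiline "message" field produced by Filebeat into structured fields
  xml {
    source      => "message"
    target      => "record"
    force_array => false   # keep single child elements as scalars instead of arrays
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
  }
}
```

With either approach, the child elements of your XML entry should end up as nested fields under the chosen target (e.g. record.date, record.millis), which matches what you described wanting.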