No idea how to set up a pipeline for XML parsing

Hello,

I know there are a lot of similar topics, because I was trying to figure out from them how to parse logs from XML into Elastic. What I'm trying to do is:

<record>
  <date>2024-11-08T18:32:56.379787054Z</date>
  <millis>1731090776379</millis>
  <nanos>787054</nanos>
  <sequence>102</sequence>
  <logger>org.forgerock.openidm.relationship.SignalPropagationCalculatorFactory</logger>
  <level>INFO</level>
  <class>org.forgerock.openidm.relationship.SignalPropagationCalculatorFactory</class>
  <method>getSignalPropagationCalculator</method>
  <thread>15</thread>
  <message>Smart-signaling disabled: false</message>
</record>

Each record should end up as one log event, with the XML elements split into separate fields, for example fieldxml.date, fieldxml.millis. I was trying to figure it out with ChatGPT, but even that couldn't help me. I tried something like this:

filter {
  if "xml" in [log][file][path] {
    # wrap the message in a <record> root element
    mutate {
      replace => { "message" => "<record>%{message}</record>" }
    }

    # parse the xml into a temporary field
    xml {
      source => "message"
      target => "parsed_xml"
      store_xml => true
      force_array => false
    }

    # promote the parsed elements to top-level fields
    mutate {
      rename => {
        "[parsed_xml][date]"     => "date"
        "[parsed_xml][millis]"   => "millis"
        "[parsed_xml][nanos]"    => "nanos"
        "[parsed_xml][sequence]" => "sequence"
        "[parsed_xml][logger]"   => "logger"
        "[parsed_xml][level]"    => "level"
        "[parsed_xml][class]"    => "class"
        "[parsed_xml][method]"   => "method"
        "[parsed_xml][thread]"   => "thread"
        "[parsed_xml][message]"  => "log_message"
      }
    }

    mutate {
      remove_field => ["message", "parsed_xml"]
    }
  }
}
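If I understand the xml filter correctly, the config above should produce an event with the fields split out, roughly like this (a sketch built from the sample record above, not actual output):

{
  "date": "2024-11-08T18:32:56.379787054Z",
  "millis": "1731090776379",
  "sequence": "102",
  "level": "INFO",
  "logger": "org.forgerock.openidm.relationship.SignalPropagationCalculatorFactory",
  "log_message": "Smart-signaling disabled: false"
}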

Thanks for helping! 🙂

Use multiline to merge the lines of each record into one event.
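For example, if logstash reads the file directly, a minimal sketch would be a multiline codec on the input (the path is a placeholder):

input {
  file {
    path => "/opt/path/to/my/xml.log"
    codec => multiline {
      pattern => "^<record>"
      negate => true
      what => "previous"
    }
  }
}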

Okay, but where should I put that multiline? In filebeat.yml or in my pipeline on Elastic?

The code you initially posted is for a logstash pipeline, not filebeat or even elasticsearch.

Usually the data flow is one of the following:

  • filebeat -> elasticsearch
  • filebeat -> logstash -> elasticsearch
  • logstash -> elasticsearch

Filebeat (beats) is a "standalone" binary deployed on the host where you want to collect data, while logstash can indeed collect logs too but requires a JVM to run.

Once collected, the events are sent to elasticsearch and, if requested, an ingest pipeline is executed on each event during ingestion, resulting in a new document within your target index.
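For illustration, a minimal ingest pipeline could look like this (the pipeline name and field names are just examples, not something from your setup):

PUT _ingest/pipeline/my-xml-pipeline
{
  "description": "example pipeline run at ingest time",
  "processors": [
    { "rename": { "field": "record.message", "target_field": "log_message", "ignore_missing": true } }
  ]
}

Filebeat can then reference it with the `pipeline` option of its elasticsearch output.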

So in your case, since you mentioned filebeat, I assume you plan to use filebeat to read the logs, then send them to logstash or directly to elasticsearch.

Then, to process XML-formatted log data with filebeat, you can indeed use multiline to capture your complete xml entry as the message, and then parse that "message" field into structured fields with either filebeat's decode_xml processor or logstash's xml filter.

filebeat.inputs:
- type: filestream
  id: my-filestream-id
  paths:
    - /opt/path/to/my/xml.log
  parsers:
    - multiline:
        type: pattern
        pattern: '^<record>'
        negate: true
        match: after

# xml conversion can be done within the same filebeat.yml or handed over to logstash.
# if kept in the same file:
processors:
  - decode_xml:
      field: message
      target_field: "record"
      overwrite_keys: true

# any output (here logstash is not necessary)
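# e.g. straight to elasticsearch (host is a placeholder, adjust to your cluster):
output.elasticsearch:
  hosts: ["https://localhost:9200"]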

Here is a similar thread about it.

PS: This is my first post on the platform, so not sure if details are sufficient.

I don't have access to the config I implemented. I used the new agent, which is equivalent to filebeat. The config above looks like what I remember.