How to parse XML?

I'm trying to parse the following XML, returned by an API, with the xml filter plugin.

<Journey fpTime="12:33" fpDate="02.11.19" delay="0" e_delay="0" platform="102" targetLoc="abc" dirnr="8001055" prod="S     60#S" dir="Böblingen" administration="800643" depStation="Stuttgart Hbf &#x0028;tief&#x0029;" is_reachable="0" delayReason=" " approxDelay="0"></Journey>
<Journey fpTime="12:33" fpDate="02.11.19" delay="0" e_delay="0" platform="101" targetLoc="Stuttgart Schwabstr." dirnr="8006698" prod="S      5#S" dir="Stuttgart Schwabstr." administration="800643" depStation="Stuttgart Hbf &#x0028;tief&#x0029;" is_reachable="0" delayReason=" " approxDelay="0"></Journey>

My config file looks like this:

input {
  http_poller {
    urls => {
      test2 => {
        # Supports all options supported by ruby's Manticore HTTP client
        method => get
        url => ""
        headers => {
          Accept => "application/json"
        }
      }
    }
    request_timeout => 20
    # Supports "cron", "every", "at" and "in" schedules by rufus scheduler
    schedule => { cron => "* * * * * UTC"}
    codec => "plain"
    # A hash of request metadata info (timing, response headers, etc.) will be sent here
    metadata_target => "http_poller_metadata"
  }
}
filter {

  ## interpret the message payload as XML
  xml {
    source => "message"
    target => "parsed"
    store_xml => true
    force_array => false
  }

  split {
    field => "[parsed][Journey]"
    add_field => {
      ## generate a unique id for the station # X the sensor time to prevent duplicates
      id                => "%{[parsed][Journey][fpTime]}-%{[parsed][Journey][fpDate]}-%{[parsed][Journey][dirnr]}"
      targetStationName => "%{[parsed][Journey][targetLoc]}"
      time              => "%{[parsed][Journey][fpTime]}"
      dir               => "%{[parsed][Journey][dirnr]}"
      jDate             => "%{[parsed][Journey][fpDate]}"
      e_delay           => "%{[parsed][Journey][e_delay]}"
      depStation        => "%{[parsed][Journey][depStation]}"
      delayReason       => "%{[parsed][Journey][delayReason]}"
      administration    => "%{[parsed][Journey][administration]}"
      prod              => "%{[parsed][Journey][prod]}"
      platform          => "%{[parsed][Journey][platform]}"
    }
  }

  mutate {
    ## Convert the numeric fields to the appropriate data type from strings
    convert => {
      "e_delay"  => "integer"
    }
    ## put the geospatial value in the correct [ longitude, latitude ] format
    add_field => { "fullDate" => [ "%{[date]}", "%{[time]}" ]}
    ## get rid of the extra fields we don't need
    remove_field => [ "message", "parsed", "http_poller_metadata"]
  }

  ## use the embedded Unix timestamp
  date {
    match => ["fullDate", "UNIX_MS"]
    remove_field => ["jDate"]
  }
}

I want to parse each <Journey> element separately, but I get the following error message: :exception=>#<REXML::ParseException: missing attribute quote
Line: 1
Position: 6750
Last 80 unconsumed characters:

If you supplied the header 'Accept => "application/json"' I would be surprised if the API returned XML.
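
For illustration, asking for XML explicitly might look like the sketch below. Whether this API honors the Accept header at all, and which media type it expects, is an assumption; "application/xml" is only a common choice. The empty url is left exactly as in the question.

```
input {
  http_poller {
    urls => {
      test2 => {
        method => get
        url => ""
        headers => {
          # Assumption: the API serves XML for this media type
          Accept => "application/xml"
        }
      }
    }
    schedule => { cron => "* * * * * UTC" }
    codec => "plain"
  }
}
```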

If it does return the XML you showed then there will not be a [parsed][Journey] field, so your split filter does nothing. "%{[parsed][Journey][delayReason]}" should be "%{[parsed][delayReason]}" etc.
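
As a rough sketch of that change, assuming the message holds exactly one <Journey> element and the xml filter settings from the question are kept, the attributes can be copied straight from [parsed] with a mutate filter instead of split. Only a few of the fields from the question are shown.

```
filter {
  xml {
    source => "message"
    target => "parsed"
    store_xml => true
    force_array => false
  }

  # Assumption: <Journey> is the root element, so its attributes sit
  # directly under [parsed] and there is no [parsed][Journey] to split.
  mutate {
    add_field => {
      "targetStationName" => "%{[parsed][targetLoc]}"
      "time"              => "%{[parsed][fpTime]}"
      "jDate"             => "%{[parsed][fpDate]}"
      "delayReason"       => "%{[parsed][delayReason]}"
    }
  }
}
```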

None of that explains the missing attribute quote error.

I was able to fix that error. My config now looks like this:

input {
  http_poller {
    urls => {
      test2 => {
        method => get
        url => ""
      }
    }
    codec => multiline {
      pattern => "<Journey"
      negate => true
      what => "previous"
      charset => "ISO-8859-1"
      auto_flush_interval => 1
    }
    request_timeout => 20
    schedule => { cron => "* * * * * UTC"}
    # A hash of request metadata info (timing, response headers, etc.) will be sent here
    metadata_target => "http_poller_metadata"
  }
}
filter {

  xml {
    remove_namespaces => true
    source => "message"
    force_array => false
    target => "msg"
    xpath => ["/Journey/@fpTime", "jTime"]
  }
}

When I run Logstash I get the following error. Do you know how to fix this?

exception=>#<REXML::ParseException: #<RuntimeError: attempted adding second root element to document>Line: 1
Position: 7305

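I believe that is what you get if you try to parse a message with more than one top-level element, something like

<Journey ...></Journey><Journey ...></Journey>

An xml filter will only work if the input has a single root element, something like

<a><Journey ...></Journey><Journey ...></Journey></a>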

And which type of filter can I use instead?

Use mutate+gsub to add an element wrapping both of the existing elements.

Ok, and how can I add an element at the beginning and at the end of the message?
filter {
  mutate { gsub => [ "message", "Start of Message", "" ] }
}

Use anchors

mutate { gsub => [ "message", "^", "<a>", "message", "$", "</a>" ] }
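
Putting the thread together, a minimal sketch of the filter section could look like the following. It assumes the whole response body arrives in message as a single event (plain codec rather than multiline), that it contains no XML declaration, and that config.support_escapes is left at its default; the wrapper name a is arbitrary. Ruby regular expressions treat ^ and $ as line anchors, so the sketch swaps in \A and \z to make sure a message containing newlines is wrapped only once.

```
filter {
  # Wrap the sibling <Journey> elements in one root element
  mutate {
    gsub => [
      "message", "\A", "<a>",
      "message", "\z", "</a>"
    ]
  }

  xml {
    source => "message"
    target => "parsed"
    store_xml => true
    force_array => false
  }

  # With two or more <Journey> children, [parsed][Journey] is an array,
  # so split emits one event per journey
  split {
    field => "[parsed][Journey]"
  }
}
```

After the split, each event carries one journey, so references such as "%{[parsed][Journey][fpTime]}" resolve per event. Note that with force_array => false a response containing only a single <Journey> yields a hash rather than an array, which split cannot handle; setting force_array => true is one way around that, at the cost of every value becoming an array.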
