Problem with xml filter

Try this filter:

filter {
    # Parse the XML in "message" into a temporary field under @metadata
    xml {
        source => "message"
        target => "[@metadata][xml_content]"
        force_array => false
    }
    # Copy XML content to first-level fields with all-lowercase names
    ruby {
        code => '
            event.get("[@metadata][xml_content]").each do |key, value|
                event.set(key.downcase, value)
            end
        '
    }
    # Drop the raw XML and the temporary copy, and make the duration numeric
    mutate {
        remove_field => ["message", "@metadata"]
        convert => {
            "durationseconds" => "integer"
        }
    }
    # Use the event's own "created" time as @timestamp
    date {
        match => ["created", "ISO8601"]
    }
}
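
For context, here is a sketch of the kind of input event that filter handles. The element names are hypothetical (only their lowercased forms show up in the example output further down), but the values match that output:

<!-- Hypothetical input; the real element names come from your own XML -->
<Task>
  <TaskId>ServerTasks-5017</TaskId>
  <TaskState>Success</TaskState>
  <Created>2015-12-22T08:20:03</Created>
  <QueueTime>2015-12-22T08:20:03</QueueTime>
  <StartTime>2015-12-22T08:20:06</StartTime>
  <CompletedTime>2015-12-22T08:21:11</CompletedTime>
  <DurationSeconds>68</DurationSeconds>
</Task>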

Notes:

  • @magnusbaeck: I thought @metadata wasn’t supposed to get passed through to the output, but it does get included in the output to stdout and Elasticsearch. Hence its presence in remove_field. Did I miss a memo? (I’m using Logstash 5.2.1.)
  • That Ruby code is a workaround for the issue I describe in “Set target of xml filter to root?”.
  • If you want to preserve the original case of the XML element names, remove .downcase from key.downcase; see the variant after this list.
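
A minimal sketch of that case-preserving variant (the same workaround, just without the lowercasing):

ruby {
    code => '
        event.get("[@metadata][xml_content]").each do |key, value|
            event.set(key, value)   # element names keep their original case
        end
    '
}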

Example Logstash output

In JSON format:

"@timestamp": "2015-12-22T08:20:03.000Z",
"completedtime": "2015-12-22T08:21:11",
"created": "2015-12-22T08:20:03",
"@version": "1",
"host": "58a3fe88f636",
"starttime": "2015-12-22T08:20:06",
"durationseconds": 68,
"taskid": "ServerTasks-5017",
"queuetime": "2015-12-22T08:20:03",
"taskstate": "Success"

More unsolicited tips

If you can, that is, if you are responsible for creating the original XML-format events, consider adding a zone designator (such as Z or +01:00) to the timestamps. Otherwise, be sure you understand the repercussions of specifying local times and how those values might be interpreted.
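
If the timestamps have to stay zone-less, one hedge on the Logstash side is the date filter's timezone option, which declares which zone those local times should be interpreted in; the zone below is only a placeholder:

date {
    match => ["created", "ISO8601"]
    # Assumed zone; replace with wherever the XML is actually produced
    timezone => "Europe/Stockholm"
}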
