We're having issues setting a date pulled from an XML document as @timestamp via the date filter.
The input XML looks like this:

<Task>
  <TaskId>ServerTasks-5017</TaskId>
  <TaskState>Success</TaskState>
  <Created>2015-12-22T08:20:03</Created>
  <QueueTime>2015-12-22T08:20:03</QueueTime>
  <StartTime>2015-12-22T08:20:06</StartTime>
  <CompletedTime>2015-12-22T08:21:11</CompletedTime>
  <DurationSeconds>68</DurationSeconds>
</Task>
The filter looks like this:

filter {
  xml {
    source => "message"
    target => "xml_content"
    xpath => [ "//Created/text()", "created" ]
  }
  date {
    match => [ "created", "yyyy-MM-dd'T'HH:mm:ss" ]
    #match => [ "xml_content.Created", "yyyy-MM-dd'T'HH:mm:ss" ]
  }
}
As you can see I've both tried to match the content pulled from the xml directly and with the xpath.
Both end up with _dateparsefailure, and @timestamp is set to the Logstash ingest time.
Any tips for a novice logstash user on how to resolve this?
I'm not XPath-fluent, but shouldn't the expression be //Task/Created/text()? What does the resulting event look like? Use a stdout { codec => rubydebug } output. Also, what's in the logs? When the date filter fails it tells you why in the log.
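For reference, a minimal debug output section for inspecting the event would look like this:

```
output {
  stdout { codec => rubydebug }
}
```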
Even with 'force_array' set to false, the xpath-extracted field 'created' still comes out as an array. The rest of the content (the fields not extracted with xpath) is no longer in array format, though.
Any tips? Can I reference the first element of the 'created' array somehow?
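The xpath option always returns its matches as an array, even for a single hit. A plain-Ruby sketch (using a hypothetical hash standing in for a Logstash event, not the real Event API) of unwrapping the first element so a later date filter sees a plain string:

```ruby
# xpath results are arrays even when only one node matches.
event = { "created" => ["2015-12-22T08:20:03"] }

# Take the first element so the field holds a plain string.
event["created"] = event["created"].first
```

Inside a pipeline, the same idea can be done with a ruby filter operating on the real event.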
The syntax to access a field is [fieldname]. If you are referring to a top-level field, you can omit the brackets and simply use fieldname. To refer to a nested field, you specify the full path to that field: [top-level field][nested field].
Replace xml_content.Created with [xml_content][Created]
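With that reference syntax, the commented-out date filter from the original question could be sketched like this (assuming target => "xml_content" and force_array => false, so Created is a plain string):

```
date {
  match => [ "[xml_content][Created]", "yyyy-MM-dd'T'HH:mm:ss" ]
}
```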
This works with your sample XML data in Logstash 5.2.1, using force_array => false as recommended by @magnusbaeck.
filter {
xml {
source => "message"
target => "@metadata[xml_content]"
force_array => false
}
# Copy XML content to first-level fields with all-lowercase names
ruby {
code => '
event.get("@metadata[xml_content]").each do |key, value|
event.set(key.downcase, value)
end
'
}
mutate {
remove_field => ["message", "@metadata"]
convert => {
"durationseconds" => "integer"
}
}
date {
match => ["created", "ISO8601"]
}
}
Notes:
@magnusbaeck: I thought @metadata wasn’t supposed to get passed through to the output, but it does get included in output to stdout and Elasticsearch. Hence its presence in remove_field. Did I miss a memo? (I’m using Logstash 5.2.1.)
If you can—if you are responsible for creating the original XML-format events—consider adding a zone designator to the time stamps. Otherwise, be sure that you understand the repercussions of specifying local times, and how those values might be interpreted.
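To see why a missing zone designator matters, here is a plain-Ruby sketch (outside Logstash) of how the same wall-clock string maps to different instants depending on the zone attached:

```ruby
require 'time'

# Without a zone designator, Time.iso8601 assumes the machine's local zone,
# so the resulting instant depends on where the parser runs.
local = Time.iso8601("2015-12-22T08:20:03")

# With an explicit designator the instant is unambiguous.
utc = Time.iso8601("2015-12-22T08:20:03Z")
cet = Time.iso8601("2015-12-22T08:20:03+01:00")

# Same wall-clock time, but as instants they are one hour apart.
puts utc - cet  # => 3600.0
```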