Parse CSV with a variable number of columns

What would be the best way to parse a simple file
where the second column, "optional_retry_count", is sometimes present and sometimes not?

message_type, optional_retry_count, date, xml_data

EURec,20190121150448,<?xml version="1.0"....
EUSlc,2,20190121150458,<?xml version="1.0"....
EUNot,3,20190121150468,<?xml version="1.0"....
EUSet,20190121150488,<?xml version="1.0"....
EURec,0,20190121150498,<?xml version="1.0"....

I am currently using:

filter {
  grok {
    match => { "message" => "%{DATA:type},%{DATA:date},%{GREEDYDATA:xmldata}" }
  }
}

What is the best way to decode the optional second field, retry?

%{DATA:retry}

Should I use grok, dissect, or csv, and how do I incorporate the condition?

expected result:

{
  "date": "20190121150458",
  "type": "EUSlc",
  "xmldata": "<?xml version=\"1.0\"....\r",
  "retry": "2"
}

Thank you

A csv filter will not like the XML data, so I would use dissect.

if [message] =~ /^[^,]+,[^,]+,[^,]+,<\?xml/ {
    dissect { mapping => { "message" => "%{message_type},%{retry_count},%{date},%{xml_data}" } }
} else if [message] =~ /^[^,]+,[^,]+,<\?xml/ {
    dissect { mapping => { "message" => "%{message_type},%{date},%{xml_data}" } }
} else {
    mutate { add_tag => [ "somethingElse" ] }
}
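
For completeness, if you would rather handle both cases with a single grok pattern, a sketch like the one below should also work. It assumes the type field is alphanumeric and the date field is purely numeric, and it uses the field names from the expected result in the question; it is untested.

filter {
  grok {
    # The retry column is optional: when it is absent, the non-capturing
    # group fails to match and the first numeric field is captured as date.
    match => { "message" => "^%{WORD:type},(?:%{INT:retry},)?%{INT:date},%{GREEDYDATA:xmldata}$" }
  }
}

The dissect approach above is still the cheaper option, since dissect splits on literal delimiters and does not use regular expressions at all.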

Thank you, this solved my problem.
Excellent, fast answer.
