Hi,
I have the following CSV file:
"@timestamp";"col_a";"col_b";"col_c";"col_d"
"2019-08-23 04:43:16.821";"<?xml version=\"1.0\" encoding=\"UTF-8\"?>";b;c;d
I am trying to parse it in Logstash with the following filter:
filter
{
    csv
    {
        autodetect_column_names => true
        autogenerate_column_names => true
        separator => ";"
        source => "message"
        skip_empty_columns => "true"
        target => "mycsv"
    }
}
The second line (the data row) throws the following error:
"2019-08-23 04:43:16.821";"<?xml version=\"1.0\" encoding=\"UTF-8\"?>";b;c;d
[2019-08-23T12:02:38,204][WARN ][logstash.filters.csv ] Error parsing csv {:field=>"message", :source=>"\"2019-08-23 04:43:16.821\";\"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\"?>\";b;c;d\r", :exception=>#<CSV::MalformedCSVError: Missing or stray quote in line 1>}
[2019-08-23T12:02:38,208][INFO ][logstash.outputs.file ] Opening file {:path=>"c:/work/elastic/input/myoutput.json"}
[2019-08-23T12:02:38,216][INFO ][logstash.outputs.file ] Opening file {:path=>"c:/work/elastic/input/myoutput.log"}
{
    "@timestamp" => 2019-08-23T10:02:38.099Z,
          "host" => "dtpbl0319",
      "@version" => "1",
       "message" => "\"2019-08-23 04:43:16.821\";\"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\"?>\";b;c;d\r",
          "tags" => [
        [0] "_csvparsefailure"
    ]
}
Changing the quote_char is not an option for me: the input comes from logs whose content can contain single or double quotes.
Is there a way the csv filter can deal with escaped (masked) quote_chars, so that it treats them as normal text?
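The only workaround I can think of so far is to rewrite the escaped quotes into the standard CSV quote doubling before the csv filter runs. A rough, untested sketch (it assumes the escaped quotes always appear literally as \" in the message, and that config.support_escapes is left at its default of false):

filter
{
    mutate
    {
        # Turn \" into "" so the CSV parser sees standard quote doubling.
        # With config.support_escapes => false (the default), the string
        # '\\"' is the literal three-character regex \\", which matches
        # a backslash followed by a double quote.
        gsub => [ "message", '\\"', '""' ]
    }
    # ... csv filter as above ...
}

That feels fragile, though, so a native option on the csv filter would be much nicer.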
Background: I am exporting query results via Kibana's CSV export at a customer site. Now I am trying to import the results back into an independent Elasticsearch instance for our developers.
Thanks, Andreas