Hi,
I have the following CSV file:
"@timestamp";"col_a";"col_b";"col_c";"col_d"
"2019-08-23 04:43:16.821";"<?xml version=\"1.0\" encoding=\"UTF-8\"?>";b;c;d
I am trying to parse it in Logstash with the following filter:
filter {
  csv {
    autodetect_column_names => true
    autogenerate_column_names => true
    separator => ";"
    source => "message"
    skip_empty_columns => true
    target => "mycsv"
  }
}
The second line is throwing the following error:
"2019-08-23 04:43:16.821";"<?xml version=\"1.0\" encoding=\"UTF-8\"?>";b;c;d
[2019-08-23T12:02:38,204][WARN ][logstash.filters.csv     ] Error parsing csv {:field=>"message", :source=>"\"2019-08-23 04:43:16.821\";\"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\"?>\";b;c;d\r", :exception=>#<CSV::MalformedCSVError: Missing or stray quote in line 1>}
[2019-08-23T12:02:38,208][INFO ][logstash.outputs.file    ] Opening file {:path=>"c:/work/elastic/input/myoutput.json"}
[2019-08-23T12:02:38,216][INFO ][logstash.outputs.file    ] Opening file {:path=>"c:/work/elastic/input/myoutput.log"}
{
    "@timestamp" => 2019-08-23T10:02:38.099Z,
          "host" => "dtpbl0319",
      "@version" => "1",
       "message" => "\"2019-08-23 04:43:16.821\";\"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\"?>\";b;c;d\r",
          "tags" => [
        [0] "_csvparsefailure"
    ]
}
Changing the quote_char is not an option for me, since the input comes from logs whose content can contain both single and double quotes.
Is there a way the csv filter can deal with escaped quote_chars, so that it treats them as normal text?
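The only workaround I could think of so far is to rewrite the backslash-escaped quotes into RFC 4180 style doubled quotes with a mutate filter before the csv filter runs. This is just a sketch and untested against my real data; it assumes the default config.support_escapes: false, so the single-quoted '\\"' is passed through literally as the regex \\" (a backslash followed by a quote):

filter {
  # Sketch of a workaround: turn the backslash-escaped quotes (\")
  # from the export into doubled quotes ("") that Ruby's CSV parser
  # accepts, before the csv filter sees the line.
  mutate {
    gsub => [ "message", '\\"', '""' ]
  }
  csv {
    # ... same csv settings as above ...
  }
}

But I am not sure how robust that is, e.g. when a field value legitimately ends in a backslash right before the closing quote, so a built-in csv filter option would be preferable.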
Background: I am exporting query results from a customer system via Kibana's CSV export. Now I am trying to import those results back into an independent Elasticsearch instance for our developers.
Thanks, Andreas