CSV::MalformedCSVError: Unquoted fields do not allow \r or \n

Hello,

I'm getting a CSV::MalformedCSVError exception for events in which a (possibly) multiline field contains line breaks. I think I managed to join all of an event's lines into a single event using the multiline codec (at least that is my understanding, since "message" seems complete, with all the lines). What I don't understand is why "Unquoted fields do not allow \r or \n" is being raised even though the field is quoted.

Thank you in advance for your help.

Pipeline:

input {
	file {
		path => "/home/ffknob/Entwicklung/workspace/dados-abertos-elk/data/tcers/decisoes/teste.csv"
		start_position => "beginning"
		sincedb_path => "/dev/null"
		codec => multiline {
			pattern => "^%{YEAR},"
			negate => true 
			what => "previous"
			auto_flush_interval => 2
		}
	}
}
filter {
	csv {
		skip_header => true
		columns => ["ano_sessao","data_sessao","tipo_sessao","cod_orgao_julgador","nome_orgao_julgador","numero_sessao","nr_processo","cod_tipo_processo","tipo_processo","cod_orgao","nome_orgao","cod_gabinete","nome_gabinete","cod_magistrado","nome_magistrado","extra_pauta","retirado_pauta","solicitacao_vista","decisao","link_video_sessao"]
	}
	if [data_sessao] == "DATA_SESSAO" {
		drop { }
	}
}
output {
	stdout { codec => rubydebug }
}

Event output:

[2018-10-27T12:26:56,714][WARN ][logstash.filters.csv ] Error parsing csv {:field=>"message", :source=>"2017,01/02/2017,Ordinária,3,Tribunal Pleno,2,83300200130,37,Inspeção Extraordinária,55600,PM DE RIO GRANDE,22,Gabinete do Conselheiro Cezar Miola,35,Cezar Miola,Não,Não,Não,\"- Saneamento do feito, com o objetivo de fixar os exercícios de 2011 a 2015 como limite para a presente Inspeção Extraordinária.\n\",http://www1.tce.rs.gov.br/sessoes/2017/20170201_2_3_83300200130.mp4\r", :exception=>#<CSV::MalformedCSVError: Unquoted fields do not allow \r or \n (line 1).>}

{
       "message" => "2017,01/02/2017,Ordinária,3,Tribunal Pleno,2,83300200130,37,Inspeção Extraordinária,55600,PM DE RIO GRANDE,22,Gabinete do Conselheiro Cezar Miola,35,Cezar Miola,Não,Não,Não,\"- Saneamento do feito, com o objetivo de fixar os exercícios de 2011 a 2015 como limite para a presente Inspeção Extraordinária.\n\",http://www1.tce.rs.gov.br/sessoes/2017/20170201_2_3_83300200130.mp4\r",
    "@timestamp" => 2018-10-27T15:26:56.590Z,
      "@version" => "1",
          "tags" => [
        [0] "multiline",
        [1] "_csvparsefailure"
    ],
          "host" => "0.0.0.0",
}

I managed to get past the exception by removing "\n" before the csv filter runs, but with this solution I lose all the line breaks inside the text. I guess I could replace "\n" with an exotic character before the csv filter and then turn it back into "\n" after the filter has split the event into fields... But is there another (more elegant) way to solve this?

mutate { gsub => [ "message", "\n", "" ] }
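
For reference, the placeholder idea could be sketched roughly like this. It is untested and rests on a few assumptions: "␤" is an arbitrary placeholder character that is assumed never to appear in the data, "decisao" is assumed to be the only column that can contain line breaks, and a ruby filter is used for the restore step because a "\n" in a mutate replacement string may be taken literally unless config.support_escapes is enabled.

```
filter {
	# Swap embedded line breaks for a placeholder (assumed unused in the data)
	# so the csv filter sees a single physical line.
	mutate { gsub => [ "message", "\n", "␤" ] }

	csv {
		skip_header => true
		columns => ["ano_sessao","data_sessao","tipo_sessao","cod_orgao_julgador","nome_orgao_julgador","numero_sessao","nr_processo","cod_tipo_processo","tipo_processo","cod_orgao","nome_orgao","cod_gabinete","nome_gabinete","cod_magistrado","nome_magistrado","extra_pauta","retirado_pauta","solicitacao_vista","decisao","link_video_sessao"]
	}

	# Restore the line breaks in the parsed field only; events where the
	# field is missing or not a string are left untouched.
	ruby {
		code => 'd = event.get("decisao"); event.set("decisao", d.gsub("␤", "\n")) if d.is_a?(String)'
	}
}
```

The original "message" field still carries the placeholder after this; if that matters, the same restore step would have to be applied to it as well.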
