CSV Filter - Quote character causing _csvparsefailure


(David Perry) #1

The context is logstash and bro http logs.

Both logstash and bro are minimally configured, just enough to test. Bro is generating tsv files and logstash is using the csv filter to process.

In testing, logstash occasionally throws _csvparsefailure errors. Through the process of elimination, I have determined that a double-quote character embedded in a field triggers the _csvparsefailure. The parse error occurs ONLY and ALWAYS when a Bro field contains a double-quote. Single quotes are fine.

The logstash message field shows what looks like correctly escaped quotes:
&_cvar={"1":["service","fqdn"]}

Bro has this:
&_cvar=%7B%221%22%3A%5B%22service%22%2C%22fqdn%22%5D%7D

Aside from modifying the input data (which just seems wrong), is there another solution/work-around? I have tried explicitly setting quote_char to something else, but I have not found a valid character that does not occur in some log entry.

Thanks,
David


(Makara) #2

Hi @perry29,

You can replace the double quotes with blank space in the logstash filter section for each message, like this:

mutate {
  gsub => ["message", "\"", "  "]
}

(Chander Mohan) #3

@Makra

I'm curious where this code should go. I ask because if we place it in the filter section after the columns are defined (see my logstash.config below), it throws an error.

Can we place it before the columns field? If yes, please share a sample.

Part of my logstash.config file:
filter{

   csv {
 	separator => ","
#	skip_empty_columns => true
	columns => [ "Month", "Quarter", "Year", "INCIDENT_ID" , "RESOLUTION",  "SUMMARY"]
    }
	mutate {convert => ["Month", "integer"]}
	mutate {convert => ["Quarter", "integer"]}
  	mutate {convert => ["Year", "integer"]}
  	mutate {gsub => ["RESOLUTION", "['"\\]","0"]
	mutate {gsub => ["SUMMARY", "['"\\]","0"]
}

(Makara) #4

Call mutate { gsub => ... } before

csv {

}

so that each event from the CSV file gets the substitution (double quotes replaced with single quotes, or whatever you like).
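
A minimal sketch of that ordering, reusing the column names from post #3. The key detail is that the gsub targets the raw "message" field, because the named columns do not exist until the csv filter has run. The single-quote replacement is only one possible choice, and the sketch assumes no field relies on double quotes to protect an embedded separator:

```
filter {
  # Runs first: rewrite double quotes in the raw line before it is parsed.
  mutate {
    gsub => ["message", "\"", "'"]
  }
  # Runs second: split the cleaned line into named columns.
  csv {
    separator => ","
    columns => ["Month", "Quarter", "Year", "INCIDENT_ID", "RESOLUTION", "SUMMARY"]
  }
}
```

Any gsub on a named column such as "RESOLUTION" would still have to come after the csv block.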


(Chander Mohan) #5

Thanks for the quick response, @Makra.

As I expected, if we use it before the column names, Logstash doesn't recognize them, because the columns have not been defined yet when it reads the mutate instruction.

    filter{
	mutate {gsub => ["RESOLUTION", "['"\\]","0"]
	mutate {gsub => ["SUMMARY", "['"\\]","0"]
   csv {
 	separator => ","
	skip_empty_columns => true
	columns => [ "Month", "Quarter", "Year", "INCIDENT_ID", "REQ_ID", "COUNTRY", "SERVICE", "ASSIGNED_GROUP", "STATUS", "STATUS_REASON", "SERVICE_TYPE","PRIORITY", "URGENCY", "IMPACT", "REPORTED_SOURCE" ]

// The error I encountered; it points exactly to where the mutate command starts.

[2018-02-25T22:22:09,455][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2018-02-25T22:22:09,698][ERROR][logstash.agent           ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of #, {, ,, ] at line 10, column 37 (byte 216) after filter{\n\tmutate {gsub => [\"RESOLUTION\", \"['\"", :backtrace=>["C:/Users/1480587/Documents
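
The parse error above stops right after the opening of the gsub pattern, which suggests that the unescaped double quote inside the pattern ends the config string early. A hedged sketch of the two gsub lines with that quote escaped and the missing closing braces added follows. It assumes the default setting in which Logstash does not process escape sequences in config strings (config.support_escapes off), so the backslashes reach the regex engine unchanged:

```
# Placed after the csv { ... } block, so the named columns already exist.
# Pattern is a character class: single quote, double quote, or backslash.
mutate { gsub => ["RESOLUTION", "['\"\\]", "0"] }
mutate { gsub => ["SUMMARY", "['\"\\]", "0"] }
```

This only addresses the configuration syntax error. Quotes that break the csv parse itself still have to be handled on the "message" field before the csv block.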

(David Perry) #6

Thanks for the specific examples, @Makra. So very helpful for a beginner. I will try this as a work-around and hope that the root problem will get fixed. I do not like losing the data (the quote characters) in the original message, but accept that I don't have much choice at this point.

David

