Parsing a CSV file where some fields contain the separator character as a value

I have the following type of entry in my CSV file:

16 , 5 , 53 , Logging/LoggingError , 2, error while opening file "a,b,c,d": , 10.10.4.219 , serviceA

I am trying to parse it using the csv plugin as follows:

csv {
    columns => ["time","split","event","event_name","level","error_msg","ip","service"]
    separator => ","
}

Now the problem is that I have used "," (comma) as the separator, but my error_msg column also contains commas, so Logstash parses those values as separators too. In the actual log file, different error_msg values contain different numbers of commas.

So please help me work out how I can solve this problem!

Thanks in advance :slight_smile:

Can you show the actual raw event as it is in the log file you're trying to parse?

Perhaps mutate => gsub could help by tidying up the event before sending it to the csv filter, something like the sketch below.
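This is only a sketch: it assumes the double quotes on each line are balanced, so a comma is "inside quotes" exactly when an odd number of quote characters follow it on the line:

filter {
  mutate {
    # Remove any comma that is followed by an odd number of double
    # quotes, i.e. a comma sitting inside a quoted section.
    gsub => [ "message", ',(?=[^"]*"(?:[^"]*"[^"]*")*[^"]*$)', "" ]
  }
}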

That line of data isn't properly formatted as CSV.

You have an unescaped comma and an unescaped double quote in the text. If you use a text qualifier, parsing will fail on the unescaped double quote; if you don't use one, it will fail on the unescaped comma. There is no real solution without changing the data.

Ideally that line should look like:

16 , 5 , 53 , "Logging/LoggingError" , 2, "error while opening file ""a,b,c,d"":" , "10.10.4.219" , "serviceA"

By doubling up the double quotes you are telling the CSV parser to escape them. But if you can't change that source data, you may be out of luck.
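For what it's worth, if you can fix the source to that properly quoted form, the csv filter should cope with it out of the box; its quote_char option defaults to the double quote anyway, so this just makes the default explicit:

csv {
    columns => ["time","split","event","event_name","level","error_msg","ip","service"]
    separator => ","
    quote_char => '"'
}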

Thank you all for the suggestions.

I used mutate => gsub to replace the "," (comma) present inside the column with "" (blank).
Now it parses correctly for a single event using:

input {
    stdin {
        type => "application1"
    }
}

But when reading the complete file, the fields of one record end up spread across multiple lines.
So how can I use a multiline pattern with a CSV file?

I could not find anything about multiline CSV files!
Please suggest some solution for multiline CSV files. Something like the sketch below is what I have in mind, but I'm not sure of the right pattern.
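This is only a guess, assuming every new record starts with a number (the time column); the path is just a placeholder:

input {
    file {
        path => "/path/to/your.csv"
        start_position => "beginning"
        codec => multiline {
            # A new record starts with a digit; any line that does not
            # is joined onto the previous line.
            pattern => "^\d+"
            negate => true
            what => "previous"
        }
    }
}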