CSV Illegal Quoting in Line 1

I'm receiving the following error & some of my incoming logs are being not being parsed and therefore dropped.

[2021-11-22T09:50:11,381][WARN ][logstash.filters.csv     ][MID-CUCM][6a45eeb30e5ba587e03ab6fc5b123b705e6e76c0996435d0b427c2741d24c7f8] Error parsing csv {:field=>"message", :source=>"C8A5A0662CEF69\",3,\"00000000005B43B9021E173B00000000\",0,0,0,\"\",\"\",2,\"\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",0,0,\"\",\"\",\"\",\"\",\"\",\"\",\"\",\"\",0,0,0,\"5634\",\"5634\",\"5634\",\"\"", :exception=>#<CSV::MalformedCSVError: Illegal quoting in line 1.>}

Here's that same line broken out a bit:

[2021-11-22T09:50:11,381][WARN ][logstash.filters.csv     ][MID-CUCM][6a45eeb30e5ba587e03ab6fc5b123b705e6e76c0996435d0b427c2741d24c7f8] 
Error parsing csv 
{:field=>"message",
 :source=>"C8A5A0662CEF69\",
3,
\"00000000005B43B9021E173B00000000\",
0,
0,
0,
\"\",
\"\",
2,
\"\",
\"\",
\"\",
\"\",
\"\",
\"\",
\"\",
\"\",
0,
0,
\"\",
\"\",
\"\",
\"\",
\"\",
\"\",
\"\",
\"\",
0,
0,
0,
\"5634\",
\"5634\",
\"5634\",
\"\"",

 :exception=>#<CSV::MalformedCSVError: Illegal quoting in line 1.>}

Here's my config file:

input {
        file {
                path =>"/sftp/etcsftp/incoming/cdr*"
                type => "cucm-cdr"
        add_field => { "cucm_clustername" => "CUCM_Cluster_1" }
        }    
}

filter {
        if [type] == "cucm-cdr"{
                if ([message] =~ /^\"cdrRecordType/) {
                        drop{}
                }
                if ([message] =~ /^INTEGER/) {
                        drop{}
                }
                csv {
                        columns => ["cdrRecordType","globalCallID_callManagerId","globalCallID_callId","origLegCallIdentifier","dateTimeOrigination","origNodeId","origSpan","origIpAddr","callingPartyNumber","callingPartyUnicodeLoginUserID","origCause_location","origCause_value","origPrecedenceLevel","origMediaTransportAddress_IP","origMediaTransportAddress_Port","origMediaCap_payloadCapability","origMediaCap_maxFramesPerPacket","origMediaCap_g723BitRate","origVideoCap_Codec","origVideoCap_Bandwidth","origVideoCap_Resolution","origVideoTransportAddress_IP","origVideoTransportAddress_Port","origRSVPAudioStat","origRSVPVideoStat","destLegIdentifier","destNodeId","destSpan","destIpAddr","originalCalledPartyNumber","finalCalledPartyNumber","finalCalledPartyUnicodeLoginUserID","destCause_location","destCause_value","destPrecedenceLevel","destMediaTransportAddress_IP","destMediaTransportAddress_Port","destMediaCap_payloadCapability","destMediaCap_maxFramesPerPacket","destMediaCap_g723BitRate","destVideoCap_Codec","destVideoCap_Bandwidth","destVideoCap_Resolution","destVideoTransportAddress_IP","destVideoTransportAddress_Port","destRSVPAudioStat","destRSVPVideoStat","dateTimeConnect","dateTimeDisconnect","lastRedirectDn","pkid","originalCalledPartyNumberPartition","callingPartyNumberPartition","finalCalledPartyNumberPartition","lastRedirectDnPartition","duration","origDeviceName","destDeviceName","origCallTerminationOnBehalfOf","destCallTerminationOnBehalfOf","origCalledPartyRedirectOnBehalfOf","lastRedirectRedirectOnBehalfOf","origCalledPartyRedirectReason","lastRedirectRedirectReason","destConversationId","globalCallId_ClusterID","joinOnBehalfOf","comment","authCodeDescription","authorizationLevel","clientMatterCode","origDTMFMethod","destDTMFMethod","callSecuredStatus","origConversationId","origMediaCap_Bandwidth","destMediaCap_Bandwidth","authorizationCodeValue","outpulsedCallingPartyNumber","outpulsedCalledPartyNumber","origIpv4v6Addr","destIpv4v6Addr","origVideoCap_Codec_Channel2","origVideoCap_Bandwidth_Channel2","origVideoCap_Resolution_Channel2","origVideoTransportAddress_IP_Channel2","origVideoTransportAddress_Port_Channel2","origVideoChannel_Role_Channel2","destVideoCap_Codec_Channel2","destVideoCap_Bandwidth_Channel2","destVideoCap_Resolution_Channel2","destVideoTransportAddress_IP_Channel2","destVideoTransportAddress_Port_Channel2","destVideoChannel_Role_Channel2","IncomingProtocolID","IncomingProtocolCallRef","OutgoingProtocolID","OutgoingProtocolCallRef","currentRoutingReason","origRoutingReason","lastRedirectingRoutingReason","huntPilotPartition","huntPilotDN","calledPartyPatternUsage","IncomingICID","IncomingOrigIOI","IncomingTermIOI","OutgoingICID","OutgoingOrigIOI","OutgoingTermIOI","outpulsedOriginalCalledPartyNumber","outpulsedLastRedirectingNumber","wasCallQueued","totalWaitTimeInQueue","callingPartyNumber_uri","originalCalledPartyNumber_uri","finalCalledPartyNumber_uri","lastRedirectDn_uri"]
                        separator => ","
                }
                if "_csvparsefailure" in [tags] {
                drop{}
        }
        if "_dateparsefailure" in [tags] {
                drop{}
        }
        mutate {
            convert => [ "duration", "integer" ]
        }
        
        mutate {
                                add_field => ["dateTimeImport", "%{@timestamp}"]
                }


                date {
                        match => ["dateTimeOrigination", "UNIX"]
                }

                date {
                        match => ["dateTimeOrigination", "UNIX"]
                        target => "dateTimeOrigination_formatted"
                }

                date {
                        match => ["dateTimeConnect", "UNIX"]
                        target => "dateTimeConnect_formatted"
                }

                date {
                        match => ["dateTimeDisconnect", "UNIX"]
                        target => "dateTimeDisconnect_formatted"
                }

        
        translate {
                        source => "destMediaCap_payloadCapability"
                        target => "destMediaCap_payloadCapability_text"
                        dictionary => [ 
                                "0","NoCodec",
                                "1","NonStandard",
                                "2","G711Alaw 64k",
                                "3","G711Alaw 56k",
                                "4","G711mu-law 64k",
                                "5","G711mu-law 56k",
                                "6","G722 64k",
                                "7","G722 56k",
                                "8","G722 48k",
                                "9","G7231",
                                "10","G728",
                                "11","G729",
                                "12","G729AnnexA",
                                "13","Is11172AudioCap",
                                "14","Is13818AudioCap",
                                "15","G.729AnnexB",
                                "16","G.729 Annex AwAnnexB",
                                "18","GSM Full Rate",
                                "19","GSM Half Rate",
                                "20","GSM Enhanced Full Rate",
                                "25","Wideband 256K",
                                "32","Data 64k",
                                "33","Data 56k",
                                "40","G7221 32K",
                                "41","G7221 24K",
                                "42","AAC",
                                "80","GSM",
                                "81","ActiveVoice",
                                "82","G726_32K",
                                "83","G726_24K",
                                "84","G726_16K",
                                "86","iLBC",
                                "89","iSAC",
                                "100","H261",
                                "101","H263",
                                "102","Vieo",
                                "103","H264",
                                "106","H224"
            ]
        }

    }
      translate {
                        source => "destDeviceName"
                        target => "destDeviceName_Friendly"
                        dictionary_path => "/opt/logstash/patterns/mid-cucm_destdevicenamefriendly.yaml"
                        fallback => "Other Phone Answered"
                        refresh_interval => 300
                  }

        translate {
                        source => "finalCalledPartyNumber"
                        target => "finalCalledPartyNumber_Friendly"
                        dictionary_path => "/opt/logstash/patterns/mid-cucm_finalcalledpartynumber_friendly.yaml"
                        fallback => "Other Phone Answered"
                        refresh_interval => 300
                  }
}
output {
   elasticsearch { 
    hosts => ""
    user => ""
    password => ""
    ilm_rollover_alias => "network-cisco-cucm-mid"
    ilm_pattern => "001"
    ilm_policy => "mid-cucm-ilm"
      }
}

Hi,

Can you show a sample of your incoming csv data ?

It looks like the parser is having troube with illegal characters, maybe we can fix this.

So I'm in the process of moving these logs to our elastic cloud and our on-prem version of the conf file works. I will put that below as well.

1,2,5981253,35527805,1637596724,2,35527805,95174559,"819133757527","",0,16,4,111951775,11782,4,20,0,0,0,0,0,0,"0","0",35527821,2,0,1178535946,"5280","5280","xxx",0,0,4,1178535946,24584,4,20,0,0,0,0,0,0,"0","0",1637596724,1637597207,"1871","17b1fafc-c36f-4cbd-b1e1-b00c21adc633","Enterprise DN","","Enterprise DN","ICDPort",483,"GC_SIP_trunk","SEP94E6F7EA5E01",12,10,0,10,0,4,0,"xxxx",10,"","",0,"",2,3,0,0,64,64,"","","","xxx","xxx",0,0,0,0,0,0,0,0,0,0,0,0,1,"EDE8BE80000100000004FC453D0E080A",3,"00000000005B4445021E1C8D00000000",0,0,0,"xxx","1755",2,"","","","","","","","",0,0,"","email","email","","","","","",0,0,0,"5280","5280","1871","1755"
input {
        file {
                path =>"/home/cucmhome/september/cdr*"
                type => "cucm-cdr"
        add_field => { "cucm_clustername" => "CUCM_Cluster_1" }
        }
        file {
                path =>"/home/cucmhome/september/cmr*"
                type => "cucm-cmr"
        add_field => { "cucm_clustername" => "CUCM_Cluster_1" }
        }
    
}

filter {
        if [type] == "cucm-cdr"{
                if ([message] =~ /^\"cdrRecordType/) {
                        drop{}
                }
                if ([message] =~ /^INTEGER/) {
                        drop{}
                }
                csv {
                        columns => ["cdrRecordType","globalCallID_callManagerId","globalCallID_callId","origLegCallIdentifier","dateTimeOrigination","origNodeId","origSpan","origIpAddr","callingPartyNumber","callingPartyUnicodeLoginUserID","origCause_location","origCause_value","origPrecedenceLevel","origMediaTransportAddress_IP","origMediaTransportAddress_Port","origMediaCap_payloadCapability","origMediaCap_maxFramesPerPacket","origMediaCap_g723BitRate","origVideoCap_Codec","origVideoCap_Bandwidth","origVideoCap_Resolution","origVideoTransportAddress_IP","origVideoTransportAddress_Port","origRSVPAudioStat","origRSVPVideoStat","destLegIdentifier","destNodeId","destSpan","destIpAddr","originalCalledPartyNumber","finalCalledPartyNumber","finalCalledPartyUnicodeLoginUserID","destCause_location","destCause_value","destPrecedenceLevel","destMediaTransportAddress_IP","destMediaTransportAddress_Port","destMediaCap_payloadCapability","destMediaCap_maxFramesPerPacket","destMediaCap_g723BitRate","destVideoCap_Codec","destVideoCap_Bandwidth","destVideoCap_Resolution","destVideoTransportAddress_IP","destVideoTransportAddress_Port","destRSVPAudioStat","destRSVPVideoStat","dateTimeConnect","dateTimeDisconnect","lastRedirectDn","pkid","originalCalledPartyNumberPartition","callingPartyNumberPartition","finalCalledPartyNumberPartition","lastRedirectDnPartition","duration","origDeviceName","destDeviceName","origCallTerminationOnBehalfOf","destCallTerminationOnBehalfOf","origCalledPartyRedirectOnBehalfOf","lastRedirectRedirectOnBehalfOf","origCalledPartyRedirectReason","lastRedirectRedirectReason","destConversationId","globalCallId_ClusterID","joinOnBehalfOf","comment","authCodeDescription","authorizationLevel","clientMatterCode","origDTMFMethod","destDTMFMethod","callSecuredStatus","origConversationId","origMediaCap_Bandwidth","destMediaCap_Bandwidth","authorizationCodeValue","outpulsedCallingPartyNumber","outpulsedCalledPartyNumber","origIpv4v6Addr","destIpv4v6Addr","origVideoCap_Codec_Channel2","origVideoCap_Bandwidth_Channel2","origVideoCap_Resolution_Channel2","origVideoTransportAddress_IP_Channel2","origVideoTransportAddress_Port_Channel2","origVideoChannel_Role_Channel2","destVideoCap_Codec_Channel2","destVideoCap_Bandwidth_Channel2","destVideoCap_Resolution_Channel2","destVideoTransportAddress_IP_Channel2","destVideoTransportAddress_Port_Channel2","destVideoChannel_Role_Channel2","IncomingProtocolID","IncomingProtocolCallRef","OutgoingProtocolID","OutgoingProtocolCallRef","currentRoutingReason","origRoutingReason","lastRedirectingRoutingReason","huntPilotPartition","huntPilotDN","calledPartyPatternUsage","IncomingICID","IncomingOrigIOI","IncomingTermIOI","OutgoingICID","OutgoingOrigIOI","OutgoingTermIOI","outpulsedOriginalCalledPartyNumber","outpulsedLastRedirectingNumber","wasCallQueued","totalWaitTimeInQueue","callingPartyNumber_uri","originalCalledPartyNumber_uri","finalCalledPartyNumber_uri","lastRedirectDn_uri"]
                        separator => ","
                }
                if "_csvparsefailure" in [tags] {
                drop{}
        }
        if "_dateparsefailure" in [tags] {
                drop{}
        }
        mutate {
            convert => [ "duration", "integer" ]
        }
        
        mutate {
                                add_field => ["dateTimeImport", "%{@timestamp}"]
                }


                date {
                        match => ["dateTimeOrigination", "UNIX"]
                }

                date {
                        match => ["dateTimeOrigination", "UNIX"]
                        target => "dateTimeOrigination_formatted"
                }

                date {
                        match => ["dateTimeConnect", "UNIX"]
                        target => "dateTimeConnect_formatted"
                }

                date {
                        match => ["dateTimeDisconnect", "UNIX"]
                        target => "dateTimeDisconnect_formatted"
                }

        
        translate {
                        field => "destMediaCap_payloadCapability"
                        destination => "destMediaCap_payloadCapability_text"
                        dictionary => [ 
                                "0","NoCodec",
                                "1","NonStandard",
                                "2","G711Alaw 64k",
                                "3","G711Alaw 56k",
                                "4","G711mu-law 64k",
                                "5","G711mu-law 56k",
                                "6","G722 64k",
                                "7","G722 56k",
                                "8","G722 48k",
                                "9","G7231",
                                "10","G728",
                                "11","G729",
                                "12","G729AnnexA",
                                "13","Is11172AudioCap",
                                "14","Is13818AudioCap",
                                "15","G.729AnnexB",
                                "16","G.729 Annex AwAnnexB",
                                "18","GSM Full Rate",
                                "19","GSM Half Rate",
                                "20","GSM Enhanced Full Rate",
                                "25","Wideband 256K",
                                "32","Data 64k",
                                "33","Data 56k",
                                "40","G7221 32K",
                                "41","G7221 24K",
                                "42","AAC",
                                "80","GSM",
                                "81","ActiveVoice",
                                "82","G726_32K",
                                "83","G726_24K",
                                "84","G726_16K",
                                "86","iLBC",
                                "89","iSAC",
                                "100","H261",
                                "101","H263",
                                "102","Vieo",
                                "103","H264",
                                "106","H224"
            ]
        }

    }
        if [type] == "cucm-cmr"{
                if ([message] =~ /^\"cdrRecordType/) {
                        drop{}
                }
                if ([message] =~ /^INTEGER/) {
                        drop{}
                }
                csv {
                        columns => ["cdrRecordType","globalCallID_callManagerId","globalCallID_callId","nodeId","directoryNum","callIdentifier","dateTimeStamp","numberPacketsSent","numberOctetsSent","numberPacketsReceived","numberOctetsReceived","numberPacketsLost","jitter","latency","pkid","directoryNumPartition","globalCallId_ClusterID","deviceName","varVQMetrics"]
            separator => ","
                }
        if "_csvparsefailure" in [tags] {
                drop{}
        }
        if "_dateparsefailure" in [tags] {
                drop{}
        }
        kv { 
            source => "varVQMetrics" 
            field_split => ";"
            prefix => "VQMetrics_"
        }

        
                mutate {
                        add_field => ["dateTimeImport", "%{@timestamp}"]
                }

                date {
                        match => ["dateTimeStamp", "UNIX"]
                }
                date {
                        match => ["dateTimeStamp", "UNIX"]
                        target => "dateTimeStamp_formatted"
                }


        }
}
output {
   elasticsearch {
#    index => "%{type}-%{+dd.MM.YYYY}" 
     index => "cucm_cdr"  
     hosts => ["localhost:9200"]
     user => ""
     password => ""
 }
}

I'm also going to comment out the sections where I drop the csv & date parse failure to see if i can easily identify which logs are having issues and why.

In my original error i saw this extra quote
image
but i dont see that in the files im opening

I think i've narrowed it down to an extra quote that is appearing on some of the logs. Can i get rid of that somehow?

Perhaps remove the extra quote using mutate+gsub if you can think of a regexp that only matches the extra quote.

""$ would match double quote at the end of a line if that can do the trick.

You know what else is weird? When I restart the logstash service, this issue goes away and then re-surfaces several days later. I'm going to keep an eye on this portion since the data between my two environments is the same and one is having trouble where the other isnt.

The only thing i can think of is that in the env. thats sending logs to the cloud i have a cron job set to delete any csv files older than a day.

Is there any logrotate jobs ?

Sorry, how would I check for that? I don't believe so.

root@mid-eslog-01:/etc# cat logrotate.conf
# see "man logrotate" for details
# rotate log files weekly
weekly

# use the adm group by default, since this is the owning group
# of /var/log/syslog.
su root adm

# keep 4 weeks worth of backlogs
rotate 4

# create new (empty) log files after rotating old ones
create

# use date as a suffix of the rotated file
#dateext

# uncomment this if you want your log files compressed
#compress

# packages drop log rotation information into this directory
include /etc/logrotate.d

# system-specific logs may be also be configured here.
root@mid-eslog-01:/etc# tail -10 /var/log/syslog
Nov 23 10:10:58 mid-eslog-01 multipathd[689]: sda: failed to get sysfs uid: Invalid argument
Nov 23 10:10:58 mid-eslog-01 multipathd[689]: sda: failed to get sgio uid: No such file or directory
Nov 23 10:11:03 mid-eslog-01 multipathd[689]: sda: add missing path
Nov 23 10:11:03 mid-eslog-01 multipathd[689]: sda: failed to get udev uid: Invalid argument
Nov 23 10:11:03 mid-eslog-01 multipathd[689]: sda: failed to get sysfs uid: Invalid argument
Nov 23 10:11:03 mid-eslog-01 multipathd[689]: sda: failed to get sgio uid: No such file or directory
Nov 23 10:11:08 mid-eslog-01 multipathd[689]: sda: add missing path
Nov 23 10:11:08 mid-eslog-01 multipathd[689]: sda: failed to get udev uid: Invalid argument
Nov 23 10:11:08 mid-eslog-01 multipathd[689]: sda: failed to get sysfs uid: Invalid argument
Nov 23 10:11:08 mid-eslog-01 multipathd[689]: sda: failed to get sgio uid: No such file or directory

Hi,

There's an include of conf at /etc/logrotate.d/

It really depends where your log are stored if they are in a default logrorate directive but i doubt it.

It could come from the way you are shipping data can you tell us more about how that csv file is built ?