Error parsing csv


#1

Hi,

I'm checking my Logstash logs and see many WARN entries with "Error parsing csv".

Example:
[2017-06-01T13:09:16,380][WARN ][logstash.filters.csv ] Error parsing csv {:field=>"message", :source=>"8:0.0/0.0\",\"m-send-req\",\"application/smil|image/jpeg|text/plain\",\"686|358810|312\",\"application/smil|image/jpeg|text/plain\",\"686|358810|312\",\"SonyE2303/26.1.A.3.111\",\"texttext\",\"texttext\",\"No\",\"1 of 1\",\"texttext\"", :exception=>#<CSV::MalformedCSVError: Illegal quoting in line 1.>}

There don't seem to be any problems with the CSV line:
"1","1","000000000","1","1","000000000","1","1","000000000","2261ef58","205E8751","30616863","1496315165","1496315168","0","0","1496747165","0","1048576","1","000000000:205E8751:","0","0","2","1","0","0","0","0","000000000","000000000","000000000","000000000","30","0","0","","texttext","1496315168:0.0/0.0","m-send-req","application/smil|image/jpeg|text/plain","686|358810|312","application/smil|image/jpeg|text/plain","686|358810|312","SonyE2303/26.1.A.3.111","POSTPAID","texttext","No","1 of 1","texttext"

Or did I miss something? I changed the sensitive text and numbers, but basically that's it.

Another example:
[2017-06-01T13:12:17,874][WARN ][logstash.filters.csv ] Error parsing csv {:field=>"message", :source=>"\",\"0\",\"1496747390\",\"0\",\"1048576\",\"0\",\"000000000:205DACD6:\",\"0\",\"0\",\"2\",\"2\",\"0\",\"0\",\"0\",\"0\",\"000000000\",\"000000000\",\"000000000\",\"000000000\",\"30\",\"0\",\"0\",\"\",\"texttext\",\"1496315390:1.310/1.310|1496315393:0.0/0.0\",\"m-send-req\",\"application/smil|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg\",\"1092|34081|44671|32513|49863|30477|32373|46123\",\"application/smil|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg\",\"1432|34120|44740|32558|49942|30516|32418|46194\",\"iPhoneOS/10.3.1 (14E304)\",\"texttext\",\"texttext\",\"\",\"1 of 1\",\"texttext\"", :exception=>#<CSV::MalformedCSVError: Unclosed quoted field on line 1.>}

The log entry:
"1","1","000000000","1","1","000000000","1","1","000000000","22619548","205DACD6","30288540","1496315390","1496315393","1496747390","0","1496747390","0","1048576","0","000000000:205DACD6:","0","0","2","2","0","0","0","0","000000000","000000000","000000000","000000000","30","0","0","","texttext","1496315390:1.310/1.310|1496315393:0.0/0.0","m-send-req","application/smil|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg","1092|34081|44671|32513|49863|30477|32373|46123","application/smil|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg|image/jpeg","1432|34120|44740|32558|49942|30516|32418|46194","iPhoneOS/10.3.1 (14E304)","texttext","texttext","","1 of 1","texttext"

Any ideas? Could the :, / or | characters be causing the parsing problems?

Thank you!
Mario
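
One thing that may be worth checking: the :source values in the two warnings look like the tail ends of the log lines rather than whole lines (the second one matches the end of its log entry character for character, and the first starts in the middle of a field). Below is a minimal Python sketch of that check. It uses shortened, made-up stand-in strings rather than the real data, `parse_fields` and `is_tail_fragment` are hypothetical helper names, and Python's csv module is only a stand-in for the Ruby CSV parser behind the Logstash csv filter, so it will not reproduce the exact exceptions.

```python
import csv

# Shortened, made-up stand-ins for a full log line and for the :source value
# reported in the warning; the real lines in the post are much longer.
full_line = '"1496315393","1496747390","0","1496747390","0","1048576","a:b/c|d","1 of 1"'
logged_source = '","0","1496747390","0","1048576","a:b/c|d","1 of 1"'


def parse_fields(line):
    """Parse one comma-separated, double-quoted line into a list of fields."""
    return next(csv.reader([line]))


def is_tail_fragment(full, fragment):
    """True if fragment is the end of full but not the whole line."""
    return len(fragment) < len(full) and full.endswith(fragment)


if __name__ == "__main__":
    fields = parse_fields(full_line)
    # Characters such as : / | inside a quoted field are ordinary data.
    print(len(fields), "fields; field containing :/| is", fields[6])
    # The parser input in the warning is only the end of the line.
    print("source is a tail fragment:", is_tail_fragment(full_line, logged_source))
```

If the same check holds on the real data, the csv filter was handed a partial line, which would point at how the file is being read rather than at the characters inside the fields. That is a guess based on the two examples above, not something confirmed in this thread.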


#2

Any ideas here? Do you need any more info?

Thanks!
Mario


(Christian Dahlqvist) #3

Which version of Logstash are you using? What does your configuration look like? I tried it, but was not able to reproduce your issue.


#4

If I use the same input file and the same settings in my test environment, I don't get any errors.

Could this be a hardware/resource issue? The server isn't overloaded. Maybe a configuration tweak is necessary?

Logstash is version 5.4.

Conf.d:

INPUT:
    input {
      file {
        path => ["/logs/smsc/bkki_", "/logs/mmsc/mmsc_elk_"]
        start_position => "beginning"
        type => "logs"
        max_open_files => 50000
      }
    }

FILTER:

    filter {
      if "/logs/smsc/bkki_" in [path] {
        csv {
          columns => ['sender','recipient','sender_shortcode','recipient_shortcode','sender_imsi','recipient_imsi','sender_msc','recipient_msc','sender_esme','recipient_esme','sender_protocol','recipient_protocol','requested_delivery','receipt','sender_prepaid','recipient_prepaid','submission_time','delivery_attempts','delivery_time','i_error','o_error','final_state','cluster_name','serving_smsc','total_segments','sender_cell_id','recipient_cell_id','ppc_assoc','protect_action','protect_rule','protect_condition','protect_error','xf1','xf2','xf3','xf4','xf5']
        }

        date {
          match => ['submission_time', 'UNIX']
        }

        date {
          match => ['submission_time', 'UNIX']
          target => 'submission_time'
        }

        if ['delivery_time'] == 0 {
          mutate {
            convert => { "delivery_time" => "integer" }
          }
        } else {
          date {
            match => ['delivery_time', 'UNIX']
            target => 'delivery_time'
          }
        }

        if [i_error] == "1:1" {
          mutate {
            replace => [ "i_error", "INTERNAL_STORAGE_ERROR" ]
          }
        }

---more if conditionals like the one above (about 30 more)---

        mutate {
          remove_field => [ "message" ]
          # remove_field => [ "msg_text" ]
        }
      }

Similar for files in the second path:

      if "/logs/mmsc/mmsc_elk_" in [path] {
        csv {
          columns => ['oa_ton','oa_npi','oa_addr','da_ton','da_npi','da_addr','divert_from_ton','divert_from_npi','divert_from_addr','thread_id','msg_id','storage_ref','submission_time','delivery_time','retry_time','schedule_delivery_time','expiry_time','priority','esm_class','registered_delivery','queue_id','protocol_id','data_coding','final_state','delivery_attempts','o_error_type','o_error_value','i_error_type','i_error_value','orig_idnt','orig_locn','dest_idnt','dest_locn','msg_type','charset','pre_submission_len','msg_text','cluster_name','history_error','mms_msg_type','orig_content_types','orig_content_sizes','dest_content_types','dest_content_sizes','UAProf','mms_charge_type','apn','mms_read_report','rcpt_num','cluster_name']
        }

        date {
          match => ['submission_time', 'UNIX']
        }

        date {
          match => ['submission_time', 'UNIX']
          target => 'submission_time'
        }

        if ['delivery_time'] == 0 {
          mutate {
            convert => { "delivery_time" => "integer" }
          }
        } else {
          date {
            match => ['delivery_time', 'UNIX']
            target => 'delivery_time'
          }
        }

        if ['retry_time'] == 0 {
          mutate {
            convert => { "retry_time" => "integer" }
          }
        } else {
          date {
            match => ['retry_time', 'UNIX']
            target => 'retry_time'
          }
        }

        if ['schedule_delivery_time'] == 0 {
          mutate {
            convert => { "schedule_delivery_time" => "integer" }
          }
        } else {
          date {
            match => ['schedule_delivery_time', 'UNIX']
            target => 'schedule_delivery_time'
          }
        }

        if ['expiry_time'] == 0 {
          mutate {
            convert => { "expiry_time" => "integer" }
          }
        } else {
          date {
            match => ['expiry_time', 'UNIX']
            target => 'expiry_time'
          }
        }

        if [final_state] == "1" {
          mutate {
            replace => [ "final_state", "ENROUTE" ]
          }
        }

---some more if conditionals like the one above (about 6 more)---

        mutate {
          remove_field => [ "message" ]
          remove_field => [ "msg_text" ]
        }
      }
    }

OUTPUT:
    output {
      if "/logs/smsc/bkki_" in [path] {
        elasticsearch {
          hosts => [ "localhost:9200" ]
          index => "smsc-%{+YYYY.MM.dd}"
          user => ***
          password => ***
        }
      }
      if "/logs/mmsc/mmsc_elk_" in [path] {
        elasticsearch {
          hosts => [ "localhost:9200" ]
          index => "mmsc-%{+YYYY.MM.dd}"
          user => ***
          password => ***
        }
      }
    }

That's basically it.

