CSV parse error

Hello,

I am trying to parse a CSV file, but when the Logstash pipeline starts, the error below is shown.
My Logstash filter is:

filter {
  if [type] == "soe_phys_tsm" {
    csv {
      columns => ["ParentCiNum","Parent_Id","Parent_Objet","Parent_CiAppartenanceCmdb","Parent_Classe","Parent_Type","Parent_SousType","Parent_Proprietaire","Parent_Environnement","Parent_Nom","Parent_ComplementUnicite","Relation","EnfantCiNum","Enfant_Id","Enfant_Objet","Enfant_CiAppartenanceCmdb","Enfant_Classe","Enfant_Type","Enfant_SousType","Enfant_Proprietaire","Enfant_Environnement","Enfant_Nom","Enfant_ComplementUnicite","DateCreation","DateModification","Parent_CiNumOasis","Enfant_CiNumOasis","Relation_Verbe"]
      separator => ","
      quote_char => "`"
    }
  }
}

and the error is :

{:timestamp=>"2017-11-27T09:47:19.113000+0100", :message=>"Error parsing csv", :field=>"message", :source=>"", :exception=>#<NoMethodError: undefined method `each_index' for nil:NilClass>, :level=>:warn}

So no data is showing in Kibana. Can anyone help, please?

Thank you

What does the input look like?

Filebeat is sending the logs to Logstash. My filebeat.yml file is:

 -
  paths:
    - /app/list/logs/cmdb/soe-cici-relations.csv
  ignore_older: 1m
  document_type: soe_phys_tsm
  fields:
    env: hors-production
    client: silca
    filebeat_v: 1.0

and my logstash input is:

input {
  beats {
    port => 8080
  }
}

It worked on other CSV files, but I don't know what this error means.

What Christian meant was, what does the input file look like?


It is a 175150-line log file generated by a machine. I think the problem is that the CSV file has quotation marks around each term, like this:

"Header1","Header2","Header3"
"aaa,"bbb","ccc"

How can I remove all the double quotes from my CSV file in the Logstash config file, please?

Then it does seem like the quote_char setting does not match your data. Based on the sample you provided, the default should work, so try removing that parameter.
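
As a rough sketch of that suggestion, assuming the file really uses commas as separators and double quotes around each value as in the sample above, the filter without quote_char would look something like this (the csv filter's default quote character is the double quote):

```
filter {
  if [type] == "soe_phys_tsm" {
    csv {
      columns => ["ParentCiNum","Parent_Id","Parent_Objet","Parent_CiAppartenanceCmdb","Parent_Classe","Parent_Type","Parent_SousType","Parent_Proprietaire","Parent_Environnement","Parent_Nom","Parent_ComplementUnicite","Relation","EnfantCiNum","Enfant_Id","Enfant_Objet","Enfant_CiAppartenanceCmdb","Enfant_Classe","Enfant_Type","Enfant_SousType","Enfant_Proprietaire","Enfant_Environnement","Enfant_Nom","Enfant_ComplementUnicite","DateCreation","DateModification","Parent_CiNumOasis","Enfant_CiNumOasis","Relation_Verbe"]
      separator => ","
      # quote_char is left out on purpose so the default (double quote) applies
    }
  }
}
```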

Thank you, I will try it.

It didn't work. Is there a way to eliminate all the double quotes in a CSV file from the Logstash config file?

What does your current config look like? Can you show a full raw input event?

I have a large CSV file in this format:

"header1";"header2";header3".... "
aaaaaaa";"bbbbbbb";ccccccc"...
..............................
..............................

It is the input file for Logstash.

What should I write in the Logstash config file to eliminate all the double quotes?

It is reporting this error:

{:timestamp=>"2017-11-27T16:56:31.108000+0100", :message=>"Pipeline main started"}

{:timestamp=>"2017-11-27T16:56:47.940000+0100", :message=>"Error parsing csv", :field=>"message", :source=>"rentCiNum";"Parent_Id";"Parent_Objet";"Parent_CiAppartenanceCmdb";"Parent_Classe";"Parent_Type";"Parent_SousType";"Parent_Proprietaire";"Parent_Environnement";"Parent_Nom";"Parent_ComplementUnicite";"Relation";"EnfantCiNum";"Enfant_Id";"Enfant_Objet";"Enfant_CiAppartenanceCmdb";"Enfant_Classe";"Enfant_Type";"Enfant_SousType";"Enfant_Proprietaire";"Enfant_Environnement";"Enfant_Nom";"Enfant_ComplementUnicite";"DateCreation";"DateModification";"Parent_CiNumOasis";"Enfant_CiNumOasis";"Relation_Verbe"", :exception=>#<CSV::MalformedCSVError: Illegal quoting in line 1.>, :level=>:warn}

my logstash config file is:

filter {
  if [type] == "soe_phys_tsm" {
    csv {
      columns => ["ParentCiNum","Parent_Id","Parent_Objet","Parent_CiAppartenanceCmdb","Parent_Classe","Parent_Type","Parent_SousType","Parent_Proprietaire","Parent_Environnement","Parent_Nom","Parent_ComplementUnicite","Relation","EnfantCiNum","Enfant_Id","Enfant_Objet","Enfant_CiAppartenanceCmdb","Enfant_Classe","Enfant_Type","Enfant_SousType","Enfant_Proprietaire","Enfant_Environnement","Enfant_Nom","Enfant_ComplementUnicite","DateCreation","DateModification","Parent_CiNumOasis","Enfant_CiNumOasis","Relation_Verbe"]
      separator => ";"
    }
  }
}

thank you

It does look like your test data is missing a few double quotes. I ran the following config using a corrected copy of your test data:

input {
  generator {
    lines => ['"aaaaaaa";"bbbbbbb";"ccccccc"']
    count => 1
  } 
} 

filter {
  csv {
    columns => ["ParentCiNum","Parent_Id","Parent_Objet","Parent_CiAppartenanceCmdb","Parent_Classe","Parent_Type","Parent_SousType","Parent_Proprietaire","Parent_Environnement","Parent_Nom","Parent_ComplementUnicite","Relation","EnfantCiNum","Enfant_Id","Enfant_Objet","Enfant_CiAppartenanceCmdb","Enfant_Classe","Enfant_Type","Enfant_SousType","Enfant_Proprietaire","Enfant_Environnement","Enfant_Nom","Enfant_ComplementUnicite","DateCreation","DateModification","Parent_CiNumOasis","Enfant_CiNumOasis","Relation_Verbe"]
    separator => ";"
    skip_empty_columns => true
  }
}

output {
  stdout { codec => rubydebug }
}

Which generated:

{
        "sequence" => 0,
      "@timestamp" => 2017-11-28T11:09:04.220Z,
    "Parent_Objet" => "ccccccc",
        "@version" => "1",
            "host" => "whitenode",
       "Parent_Id" => "bbbbbbb",
     "ParentCiNum" => "aaaaaaa",
         "message" => "\"aaaaaaa\";\"bbbbbbb\";\"ccccccc\""
}

Seems fine to me.

Thank you. In fact, the input and output are in a separate config file and cannot be modified, because I am using Filebeat to send logs to Logstash, and for each log I have to create a filter in its own config file. There are some large log files generated by a machine, and I cannot remove the double quotes manually. Some log files have no double quotes, like this:
a;b;c
d;e;f
which worked,

and some look like this:

"a";"b";"c"
"d";"e";"f"

I need to get rid of all the double quotes; it is a 175150-line log file!

The csv filter by default removes the double quotes when it parses a line, as you can see in my example.
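
For well-formed lines nothing extra is needed. If some lines really do have unbalanced quotes, as the :source value in the logged error suggests, one possible workaround is to strip every double quote with mutate's gsub before the csv filter runs. The following is only a sketch under stated assumptions: no field value contains the ";" separator or a double quote that must be kept, and blank lines can safely be dropped (the first warning in this thread showed an empty :source). It is a filter-only file, so the separate input and output files stay untouched.

```
filter {
  if [type] == "soe_phys_tsm" {
    # Assumption: blank lines carry no data, so skip them before parsing.
    if [message] =~ /^\s*$/ {
      drop { }
    }

    # Workaround: remove every double quote from the raw line.
    # This also removes the protection quotes give to a ";" inside a value.
    mutate {
      gsub => ["message", '"', ""]
    }

    csv {
      columns => ["ParentCiNum","Parent_Id","Parent_Objet","Parent_CiAppartenanceCmdb","Parent_Classe","Parent_Type","Parent_SousType","Parent_Proprietaire","Parent_Environnement","Parent_Nom","Parent_ComplementUnicite","Relation","EnfantCiNum","Enfant_Id","Enfant_Objet","Enfant_CiAppartenanceCmdb","Enfant_Classe","Enfant_Type","Enfant_SousType","Enfant_Proprietaire","Enfant_Environnement","Enfant_Nom","Enfant_ComplementUnicite","DateCreation","DateModification","Parent_CiNumOasis","Enfant_CiNumOasis","Relation_Verbe"]
      separator => ";"
    }
  }
}
```

If the file turns out to be consistently quoted, the mutate block can be removed and the csv filter alone should strip the quotes.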

thanks
