Unable to index CSV data

Hi All,

I have been trying to load CSV data using Logstash, but I am unable to index it and am getting errors like a character set mismatch.

I was able to identify that these warnings occur only for rows where some field values are empty (for example, if a ticket is not resolved, its "Resolved_Date" column will be empty).

Here are the errors:

[2018-11-13T14:13:25,886][WARN ][logstash.codecs.plain    ] Received an event that has a different character encoding than you configured. {:text=>"INC0230207,UA-SAP-UK,10-25-18 15:38,SAP workflow in transaction ZMMBV duplicates purchaseorder workflow item,TCS-SAP-DC,Self-service,FALSE,Ren\\x82 Wijnenga,Low,,,Assigned,a306815,,,\\r", :expected_charset=>"UTF-8"}
[2018-11-13T14:13:26,062][WARN ][logstash.filters.csv     ] Error parsing csv {:field=>"message", :source=>"INC0230914,UA-SAP-UK,10-26-18 14:39,SAP password reset / account unlock SAP-ID: B015060,SD-INCIDENTS,Email,FALSE,Simon Edwards,Support,10-26-18 14:42,,Resolved,a323921,,Password reset / Unlocked Account,\"Password reset done.", :exception=>#<CSV::MalformedCSVError: Unclosed quoted field on line 1.>}

Here is my config file:

input {
  file {
    path => "/opt/installables/csv/Book1.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["Number","Configuration_item","Opened","Short_description","Assignment_group","Contact_type","Major_incident","Caller","Priority","Resolved","Closed","Status","Updated_by","Closed_by","Category","Close_notes"]
  }
}
output {
#  elasticsearch {
#    hosts => "1.12.1.3:9200"
#    index => "incanalysis"
#  }
  stdout {
    codec => rubydebug
  }
}

Any suggestions on where I'm going wrong?

Thanks
Gauti

Everything looks correct.
I assume you have the following lines uncommented when you run it:

elasticsearch {
  hosts => "1.12.1.3:9200"
  index => "incanalysis"
}

Can you post what your CSV file looks like? Just one or two lines from it.

@elasticforme Even if I enable sending to Elasticsearch, I still get the same warning messages; apart from the items that trigger warnings, the remaining documents are getting indexed.

Here is a sample of my CSV:

Number Configuration_item Opened Short_description Assignment_group Contact_type Major_incident Caller Priority Resolved Closed Status Updated_by Closed_by Category Close_notes
INC0230928 DOM-CE 10-26-18 14:53 Domain Password reset SD-INCIDENTS Phone FALSE Eggermont Support 10-26-18 14:55 Resolved a044 Password reset / Unlocked Account Reset the users password
INC0230927 DOM-EMAIL 10-26-18 14:51 FUSE: Office 365 Teams EMAIL Phone FALSE Dorman Low Assigned a015
INC0230926 UA-SAP-UK 10-26-18 14:45 SAP password reset / account unlock SAP-ID: A018 SD-INCIDENTS Email FALSE ravi ravi Support 10-26-18 14:47 Resolved a006 Password reset / Unlocked Account Password reset done sent an email to the user.

Thanks
Gauti

Try skip_empty_columns on the csv filter:

https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html#plugins-filters-csv-skip_empty_columns
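
Something like this, for example (a sketch based on your config, untested; the option just goes inside the csv filter):

filter {
  csv {
    separator => ","
    # Skip columns with no value instead of emitting empty fields
    skip_empty_columns => true
    columns => ["Number","Configuration_item","Opened","Short_description","Assignment_group","Contact_type","Major_incident","Caller","Priority","Resolved","Closed","Status","Updated_by","Closed_by","Category","Close_notes"]
  }
}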

@elasticforme I hadn't used skip_empty_columns before, so that example syntax helps, thanks.

I have also identified a few special characters that are causing issues.

So now there are two things, empty columns and special characters: is there any way to ignore both and proceed with indexing?
I also tried writing a mapping for all the fields in the CSV and setting ignore_malformed on the whole index, and that ended up failing as well.

Any advice on this scenario?

Thanks
Gauti

Sorry, I don't know much about how to ignore malformed characters; this is all new to me as well.

@elasticforme Any advice on skipping special characters?

Thanks
Gauti

I think you can use mutate. I showed an example like this somewhere:

filter {
  if [message_id] {
    mutate { add_field => { "[@metadata][id]" => "%{message_id}" } }
  } else {
    # add_field needs a string value, so use "" rather than nil
    mutate { add_field => { "[@metadata][id]" => "" } }
  }
}
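
For the special characters specifically: the Ren\x82 in your warning suggests the file is not UTF-8 (0x82 is "é" in the old DOS code pages such as CP850), so rather than stripping characters you could try declaring the file's real encoding on the input codec. A sketch against your config, untested; the charset value is a guess you would need to verify against your export:

input {
  file {
    path => "/opt/installables/csv/Book1.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    # Guess: 0x82 is "é" in CP850; replace with the file's actual encoding
    codec => plain { charset => "CP850" }
  }
}

The other error ("Unclosed quoted field") usually means a quoted field contains a line break. The file input reads one line at a time, so those rows will keep failing until the newlines are removed from the export, or the broken lines are stitched back together (for example with a multiline codec) before the csv filter runs.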

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.