Unable to index CSV data

Hi All,

I have been trying to load CSV data using Logstash, but I am unable to index it and am getting errors like a character set mismatch.

I was able to identify that these warnings occur only for rows where some field values are empty (for example, if a ticket is not resolved, its "Resolved_Date" column will be empty).

Here are the errors:

[2018-11-13T14:13:25,886][WARN ][logstash.codecs.plain    ] Received an event that has a different character encoding than you configured. {:text=>"INC0230207,UA-SAP-UK,10-25-18 15:38,SAP workflow in transaction ZMMBV duplicates purchaseorder workflow item,TCS-SAP-DC,Self-service,FALSE,Ren\\x82 Wijnenga,Low,,,Assigned,a306815,,,\\r", :expected_charset=>"UTF-8"}
[2018-11-13T14:13:26,062][WARN ][logstash.filters.csv     ] Error parsing csv {:field=>"message", :source=>"INC0230914,UA-SAP-UK,10-26-18 14:39,SAP password reset / account unlock SAP-ID: B015060,SD-INCIDENTS,Email,FALSE,Simon Edwards,Support,10-26-18 14:42,,Resolved,a323921,,Password reset / Unlocked Account,\"Password reset done.", :exception=>#<CSV::MalformedCSVError: Unclosed quoted field on line 1.>}

Here is my config file:

input {
  file {
    path => "/opt/installables/csv/Book1.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["Number","Configuration_item","Opened","Short_description","Assignment_group","Contact_type","Major_incident","Caller","Priority","Resolved","Closed","Status","Updated_by","Closed_by","Category","Close_notes"]
  }
}
output {
#  elasticsearch {
#    hosts => "1.12.1.3:9200"
#    index => "incanalysis"
#  }
  stdout {
    codec => rubydebug
  }
}

Any suggestions on where I'm going wrong?

Thanks
Gauti

Everything looks correct.
I assume you have the following lines uncommented when you run it:

elasticsearch {
  hosts => "1.12.1.3:9200"
  index => "incanalysis"
}

Can you post what your CSV file looks like? Just one or two lines from it.

@elasticforme Even if I enable sending to Elasticsearch, I still get the same warning messages; apart from the items that trigger warnings, the remaining documents are getting indexed.

Here is a sample of my CSV:

Number Configuration_item Opened Short_description Assignment_group Contact_type Major_incident Caller Priority Resolved Closed Status Updated_by Closed_by Category Close_notes
INC0230928 DOM-CE 10-26-18 14:53 Domain Password reset SD-INCIDENTS Phone FALSE Eggermont Support 10-26-18 14:55 Resolved a044 Password reset / Unlocked Account Reset the users password
INC0230927 DOM-EMAIL 10-26-18 14:51 FUSE: Office 365 Teams EMAIL Phone FALSE Dorman Low Assigned a015
INC0230926 UA-SAP-UK 10-26-18 14:45 SAP password reset / account unlock SAP-ID: A018 SD-INCIDENTS Email FALSE ravi ravi Support 10-26-18 14:47 Resolved a006 Password reset / Unlocked Account Password reset done sent an email to the user.

Thanks
Gauti

Try skip_empty_columns on the csv filter:

https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html#plugins-filters-csv-skip_empty_columns
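
Something like this, for example (a sketch based on your config, untested; the option just goes inside the csv filter):

filter {
  csv {
    separator => ","
    # Skip columns with no value instead of emitting empty fields
    skip_empty_columns => true
    columns => ["Number","Configuration_item","Opened","Short_description","Assignment_group","Contact_type","Major_incident","Caller","Priority","Resolved","Closed","Status","Updated_by","Closed_by","Category","Close_notes"]
  }
}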

@elasticforme I hadn't used skip_empty_columns before, so that example syntax helps, thanks.

I have also identified a few special characters that are causing issues.

So now there are two things, empty columns and special characters: is there any way to ignore both and proceed with indexing?
I also tried writing a mapping for all the fields in the CSV and setting ignore_malformed on the whole index, and that ended up failing as well.

Any advice on this scenario?

Thanks
Gauti

Sorry, I don't know much about how to ignore malformed characters; this is all new to me as well.

@elasticforme Any advice on skipping special characters?

Thanks
Gauti

I think you can use mutate. I showed an example like this somewhere:

filter {
  if [message_id] {
    mutate { add_field => { "[@metadata][id]" => "%{message_id}" } }
  } else {
    # add_field needs a string value, so use "" rather than nil
    mutate { add_field => { "[@metadata][id]" => "" } }
  }
}
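
For the special characters specifically: the Ren\x82 in your warning suggests the file is not UTF-8 (0x82 is "é" in the old DOS code pages such as CP850), so rather than stripping characters you could try declaring the file's real encoding on the input codec. A sketch against your config, untested; the charset value is a guess you would need to verify against your export:

input {
  file {
    path => "/opt/installables/csv/Book1.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    # Guess: 0x82 is "é" in CP850; replace with the file's actual encoding
    codec => plain { charset => "CP850" }
  }
}

The other error ("Unclosed quoted field") usually means a quoted field contains a line break. The file input reads one line at a time, so those rows will keep failing until the newlines are removed from the export, or the broken lines are stitched back together (for example with a multiline codec) before the csv filter runs.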

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.