Hi,
I have a CSV file like the one below:
col1,col2,col3,col4
A,B,C,D
,,E,F
,G,H
,,,I
J,K,,,
There are more than 10 million records.
I am using the following config:
input {
  file {
    path => "/home/nandan/data/data.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["col1","col2","col3","col4"]
  }
  mutate { convert => ["col1","integer"] }
  mutate { convert => ["col3","integer"] }
}
output {
  elasticsearch {
    hosts => "localhost"
    index => "hotel"
    document_type => "rooms"
  }
  stdout {}
}
but only half of the data gets indexed into Elasticsearch. The error is:
[ERROR] 2018-05-15 14:43:56.918 [LogStash::Runner] Logstash - org.jruby.exceptions.ThreadKill
[WARN ] 2018-05-15 14:43:56.939 [Ruby-0-Thread-9@[main]>worker0: /usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:385] csv - Error parsing csv {:field=>"message", :source=>"",34103,,UNITED STATES - USA,26.1858,-81.799\r", :exception=>#<CSV::MalformedCSVError: Unclosed quoted field on line 1.>}
[WARN ] 2018-05-15 14:43:56.948 [Ruby-0-Thread-9@[main]>worker0: /usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:385] csv - Error parsing csv {:field=>"message", :source=>"81512,supplier2,Chinatown Hotel,YAOWARAJ SUMPUNTAWONG 526,"Bangkok", :exception=>#<CSV::MalformedCSVError: Unclosed quoted field on line 1.>}
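I can reproduce the parse failure outside Logstash with plain Ruby (the csv filter uses Ruby's stdlib CSV parser under the hood, as far as I understand). The second warning's source line opens a quoted field before Bangkok but never closes it:

```ruby
require 'csv'

# One of the lines Logstash warned about: the "Bangkok field is opened
# with a double quote but never closed, so the parser cannot finish the row.
line = '81512,supplier2,Chinatown Hotel,YAOWARAJ SUMPUNTAWONG 526,"Bangkok'

begin
  CSV.parse_line(line)
rescue CSV::MalformedCSVError => e
  puts "parse failed: #{e.message}"
end
```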
and only half of the data is indexed. Please tell me why this is happening.
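Could those stray double quotes in the data be the cause? If so, would overriding the csv filter's quote_char option be a reasonable workaround? A sketch of what I have in mind (assuming single quotes never occur in my data, so using them as the quote character effectively disables double-quote handling):

filter {
  csv {
    separator => ","
    quote_char => "'"
    columns => ["col1","col2","col3","col4"]
  }
}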
Thanks