Hello team,
I'm running the Elasticsearch, Logstash, and Kibana stack, version 8.15.3, set up with Docker. I'm using Logstash to load data from a CSV file into Elasticsearch.
However, I get an error for specific IDs. The CSV file itself seems to be fine, since most of the data is indexed successfully, but certain records fail to be inserted with the following error.
logstash | [2024-10-24T05:06:13,463][WARN ][logstash.outputs.elasticsearch][main][13ed313a5675abd23c078edd33a0a3c4be86d627339fd3bf53181d4183411c94] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"55946808", :_index=>"report", :routing=>nil}, {"fullname"=>"Meta Protocol /SISL-Germany / kunal.sharma", "build_status"=>"FAILED", "event"=>{"original"=>"2024-09-18,2024-09-18 23:19:32,2024-09-18 23:42:23,1371224,kunal.sharma,Meta Protocol /SISL-Germany / kunal.sharma,55946808,JBTF/T-PPPI/UBT/UTAH/SHOPPING,FAILED,BERLIN\r"}, "build_date"=>"2024-09-18", "message"=>"2024-09-18,2024-09-18 23:19:32,2024-09-18 23:42:23,1371224,kunal.sharma,Meta Protocol /SISL-Germany / kunal.sharma,55946808,JBTF/T-PPPI/UBT/UTAH/SHOPPING,FAILED,BERLIN\r", "build_id"=>"55946808", "@version"=>"1", "@timestamp"=>2024-10-24T05:06:12.479807535Z, "build_end_time"=>"2024-09-18 23:42:23", "build_requester"=>"kunal.sharma", "build_conf"=>"JBTF/T-PPPI/UBT/UTAH/SHOPPING", "build_site"=>"BERLIN", "log"=>{"file"=>{"path"=>"/usr/share/logstash/pipeline/file.csv"}}, "build_duration"=>"1371224", "host"=>{"name"=>"042fd1c8b928"}, "build_start_time"=>"2024-09-18 23:19:32"}], :response=>{"index"=>{"status"=>400, "error"=>{"type"=>"document_parsing_exception", "reason"=>"[1:868] failed to parse field [host] of type [text] in document with id '55946808'. Preview of field's value: '{name=042fd1c8b928}'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Expected text at 1:846 but found START_OBJECT"}}}}}
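If I read the error correctly, the host field is already mapped as text in the report index (presumably from events that arrived via the beats or tcp inputs, where host is a plain string), while the file input emits host as an object ({"name" => "042fd1c8b928"}), which would explain the parse failure. To double-check, the current mapping of that field can be inspected like this (assuming Elasticsearch is reachable on localhost:9200 from the Docker host; curl will prompt for the elastic password):

curl -u elastic "http://localhost:9200/report/_mapping/field/host?pretty"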
My Logstash config is as follows:
input {
  beats {
    port => 5044
  }
  tcp {
    port => 5000
  }
  file {
    path => "/usr/share/logstash/pipeline/*.csv"
    start_position => "beginning"
    sincedb_path => "/usr/share/logstash/pipeline/sincedb.txt"
  }
}

filter {
  csv {
    separator => ","
    columns => ["build_date", "build_start_time", "build_end_time", "build_duration", "build_requester", "fullname", "build_id", "build_conf", "build_status", "build_site"]
  }
}

output {
  elasticsearch {
    action => "index"
    hosts => "http://elasticsearch:9200"
    index => "report"
    user => "elastic"
    password => "************"
    document_id => "%{build_id}"
  }
  stdout {}
}
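In case it helps narrow things down: one workaround I'm considering (not yet tested; it assumes the conflict really is between the string host from the beats/tcp inputs and the host object added by the file input) is to flatten host back to a plain string in the filter block before the output, for example:

filter {
  if [host][name] {
    mutate {
      # Replace the host object with just its name, so it matches the existing text mapping
      replace => { "host" => "%{[host][name]}" }
    }
  }
}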
Could you please help identify the cause of this issue and suggest a solution?