I am trying to re-read a CSV file from the beginning, with sincedb_path set to '/dev/null'. Logstash is started with --config.reload.automatic. When the file is re-read, all entries are blank. This is not consistent: sometimes the data is read correctly, sometimes not. To trigger re-parsing, I just open csvread.config and add/remove spaces at the end.
Command line: bin/logstash -f csvread.config --config.reload.automatic
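The "add/remove spaces" step can be scripted instead of editing the file by hand; a minimal sketch (assuming csvread.config is in the current directory):

```shell
# Append a trailing space to the config so that --config.reload.automatic
# detects a change and restarts the pipeline, forcing the file to be re-read.
printf ' ' >> csvread.config
```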
Configuration file:
input {
  file {
    path => "/home/whoami/result6.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    autodetect_column_names => true
    autogenerate_column_names => true
  }
  mutate {
    add_field => { "out_timestamp" => "%{@timestamp}" }
  }
  mutate {
    rename => {
      "active" => "ACTIVE"
      "state_name" => "STATE"
    }
  }
  mutate {
    update => { }
  }
  ruby {
    code => 'event.set("OTH_MAPPING",[])'
  }
  prune {
    whitelist_names => ["out_timestamp", "^ACTIVE$", "^STATE$", "^OTH_MAPPING$"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "csv_read"
  }
}
result6.csv:
sno,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10
2,column info,75,5175,5038,62,88,5190,5040,62,35
1,column info 1,18666,921906,895949,7291,20954,925401,897147,7300,28
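As a sanity check outside Logstash, Ruby's stdlib CSV parser shows the mapping that autodetect_column_names should produce when the header line is the first line consumed (a plausible cause of the blank events is that after a reload a data line, not the header line, is seen first, so the column names never match the rename/prune rules):

```ruby
require 'csv'

# Same content as result6.csv: first line is the header row.
data = <<~CSV
  sno,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10
  2,column info,75,5175,5038,62,88,5190,5040,62,35
  1,column info 1,18666,921906,895949,7291,20954,925401,897147,7300,28
CSV

# headers: true consumes the first line as column names,
# mirroring autodetect_column_names in the csv filter.
rows = CSV.parse(data, headers: true)
puts rows.first['col2']  # prints 75
```

If a data line were consumed as the header instead, the field names would be values like "75" and "5175", and the rename of "active"/"state_name" would silently do nothing before prune strips everything else.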
Sometimes the CSV is read as below, which is the issue:
{
  "OTH_MAPPING" => [],
  "out_timestamp" => "2021-06-24T12:22:36.555Z"
}
{
  "OTH_MAPPING" => [],
  "out_timestamp" => "2010-06-24T12:22:36.555Z"
}
{
  "OTH_MAPPING" => [],
  "out_timestamp" => "2010-06-24T12:22:36.556Z"
}
Ideally it should have been read as:
{
  "ACTIVE" => "75",
  "OTH_MAPPING" => [],
  "out_timestamp" => "2010-06-24T12:22:18.434Z",
  "STATE" => "column0"
}
{
  "ACTIVE" => "32",
  "OTH_MAPPING" => [],
  "out_timestamp" => "2010-06-24T12:22:18.435Z",
  "STATE" => "Column1"
}