I am using the Logstash csv filter to parse a CSV file, with the skip_header parameter set to true. I expected the first (header) line to be skipped, but it is still emitted as an event.
Here is my config:
input {
  file {
    path => ["/usr/local/air_logs/*"]
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    skip_header => true
    columns => ["site","parameter","date","year","month","day","hour","value","unit","duration","name"]
    convert => {
      "value" => "integer"
    }
  }
  date {
    match => ["date","yyyy-MM-dd HH:mm", "M/d/yyyy H:mm"]
    target => "date"
  }
}
output {
  stdout {
    codec => "rubydebug"
  }
}
Here is my console output:
{
    "@version" => "1",
    "site" => "Site",
    "@timestamp" => 2022-05-04T03:24:46.819Z,
    "month" => "Month",
    "hour" => "Hour",
    "value" => "Value",
    "date" => "Date (LST)",
    "parameter" => "Parameter",
    "day" => "Day",
    "path" => "/usr/local/air_logs/Beijing_2008_HourlyPM2.5_created20140325.csv",
    "duration" => "Duration",
    "host" => "0.0.0.0",
    "unit" => "Unit",
    "name" => "QC Name",
    "year" => "Year",
    "message" => "Site,Parameter,Date (LST),Year,Month,Day,Hour,Value,Unit,Duration,QC Name\r",
    "tags" => [
        [0] "_dateparsefailure"
    ]
}
{
    "@version" => "1",
    "site" => "Beijing",
    "@timestamp" => 2022-05-04T03:24:46.877Z,
    "month" => "4",
    "hour" => "15",
    "value" => 207,
    "date" => 2008-04-08T07:00:00.000Z,
    "parameter" => "PM2.5",
    "day" => "8",
    "path" => "/usr/local/air_logs/Beijing_2008_HourlyPM2.5_created20140325.csv",
    "duration" => "1 Hr",
    "host" => "0.0.0.0",
    "unit" => "µg/mg³",
    "name" => "Valid",
    "year" => "2008",
    "message" => "Beijing,PM2.5,2008-04-08 15:00,2008,4,8,15,207,µg/mg³,1 Hr,Valid\r"
}
Here is my CSV file:
Site,Parameter,Date (LST),Year,Month,Day,Hour,Value,Unit,Duration,QC Name
Beijing,PM2.5,2008-04-08 15:00,2008,4,8,15,207,µg/mg³,1 Hr,Valid
Beijing,PM2.5,2008-04-08 16:00,2008,4,8,16,180,µg/mg³,1 Hr,Valid
I'm quite worried about this. Any help would be appreciated.
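
A note on the documented behavior: when skip_header is set together with columns (and without autodetect_column_names), the csv filter skips only rows whose values exactly match the configured column names. The columns in my config are lowercase names such as "site", while the header row in the file contains "Site", "Date (LST)", "QC Name" and so on, so that mismatch may be why the header row is not skipped. Below is a minimal sketch of one possible workaround, assuming the lowercase field names should be kept: drop the header event with a conditional placed after the csv filter. The comparison on [site] is an assumption for illustration and relies on no data row having the literal text "Site" in its first column.

```
filter {
  csv {
    columns => ["site","parameter","date","year","month","day","hour","value","unit","duration","name"]
    convert => {
      "value" => "integer"
    }
  }
  # Assumption: only the header row has "Site" as its first column value,
  # so this event is dropped before it reaches the date filter and the output.
  if [site] == "Site" {
    drop { }
  }
  date {
    match => ["date","yyyy-MM-dd HH:mm", "M/d/yyyy H:mm"]
    target => "date"
  }
}
```

Dropping the header before the date filter should also avoid the _dateparsefailure tag that appears on the header event in the output above.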