I am trying to read large CSV and TXT files from different sources.
The sources may give us either a CSV or a TXT file for the same data.
I want to ignore the first line of each file when it is parsed for the first time, as it contains the headers.
Sample CSV file content:
Center,Amount,Date
xxxx,2222,12-08-2017
xxxy,6222,12-08-2017
xxxz,2022,12-08-2017
xxxa,1222,12-08-2017
Sample TXT file content:
Amount,Date,Center
2222,12-09-2017,xxxy
6222,12-09-2017,xxxa
2022,12-09-2017,xxxz
1222,12-09-2017,xxxc
Content of the Logstash conf file used to read the CSV files:
input {
  file {
    path => "/Volumes/External/RnD/logstash/inputdata/amount*.csv"
    type => "parse"
    start_position => "beginning"
    sincedb_path => "/Volumes/External/RnD/logstash/sincedb.txt"
  }
}
filter {
  if [type] == "parse" {
    grok {
      match => { "message" => "%{WORD:Center},(?<Amount>[^,]*),(?<Date>[^,]*)" }
    }
  }
}
output {
  if [type] == "parse" {
    if "_grokparsefailure" in [tags] {
      file {
        path => "/Volumes/External/RnD/logstash/outputdata/failed.txt"
        codec => line { format => "%{message}" }
      }
    }
    csv {
      path => "/Volumes/External/RnD/logstash/outputdata/success.txt"
      fields => ["Center", "Amount", "Date"]
    }
  }
}
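For reference, here is a minimal sketch of one way the header rows could be dropped before grok runs. It assumes the header lines match the two sample headers shown above exactly (apart from trailing whitespace), which may not hold for every source. The regular expression is illustrative and is not part of the original configuration.

```
filter {
  if [type] == "parse" {
    # Assumption: header rows look exactly like the sample headers above.
    # \s* tolerates a trailing carriage return from Windows line endings.
    if [message] =~ /^(Center,Amount,Date|Amount,Date,Center)\s*$/ {
      # Discard the event so it reaches neither failed.txt nor success.txt.
      drop { }
    }
  }
}
```

Because the check is made on the content of the line rather than its position, it would also drop a header that reappears later in a file. Logstash evaluates filters in the order they are written, so this conditional would need to come before the grok block.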