I am importing some CSV files and would like to skip the first line (the headers). At the moment, if a CSV file has headers, those lines fail to import; if I remove the headers, the files import fine. I did some searching and found that I am supposed to use an if statement with drop {}.
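For reference, here is a minimal sketch of the conditional-drop approach described above. It is not the exact config from this thread. The column names (`timestamp`, `host`, `message`) are hypothetical, and the check assumes the header row contains the literal column name in its first field.

```
filter {
  csv {
    # Hypothetical column names; replace with the ones in your file.
    columns   => ["timestamp", "host", "message"]
    separator => ","
  }

  # The header row parses like any other row, so its first field
  # holds the literal header text. Drop the event when that happens.
  if [timestamp] == "timestamp" {
    drop { }
  }
}
```

This only works when the column list is known up front and is the same for every file, since the comparison value is hard-coded.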
I too need to process CSV files with first-line schemas. Unfortunately, the schemas can also vary from file to file.
Not seeing anything out there, I got to work and coded up a subclass of logstash-input-file that adds CSV parsing with a first-line-schema mode. The basic CSV processing is largely borrowed from logstash-filter-csv.
PS: I initially considered enhancing logstash-filter-csv, but ultimately concluded that the only 100% reliable way to restart stream processing mid-file is to re-read the file's schema row, which is something only the file input plugin can always do.