Skip header line in CSV input (v 1.5.0)

Hello,

I am importing some CSV files and would like to skip the first line (the headers). At the moment, if a CSV has a header row, that line fails to import; if I remove the headers, the files import fine. I did some searching and found that I am supposed to use an if statement with drop {}:

    input {
        file {
            path => "C:/ElasticSuite/Logs/*.csv"
            type => "mytype"
            discover_interval => "15"
        }
    }
    filter {
        csv {
            columns => ["col1", "col2", "col3"]
            separator => ","
        }
        if [col1] == "headername" {
            drop {}
        }
        mutate {
            gsub => [
                "ipAddress", "NA", "127.0.0.1",
                "dbLogFileSizeMb", "NA", "0"
            ]
        }
    }

I tried putting the if below the csv filter and then below the mutate, but it still failed either way.

Any ideas?

Thanks,


Does the drop not work at all?

This general issue has been raised here, but I don't have any other advice, sorry.

Thanks, I got it working by looking at a sample within the link that you posted. I had to wrap the column and string in parentheses.

    if ([col1] == "headername") {
        drop { }
    }

Not sure why most of the samples on the net show it without.
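
In case it helps anyone else: you can also drop the header before the csv filter ever sees it, by matching the raw line in [message]. This is just a sketch, and it assumes the header line literally starts with col1, so adjust the pattern to your actual header text:

    filter {
        # Drop the raw header line before the csv filter parses it.
        # Assumes the first line of the file starts with "col1," -
        # change the pattern to match your real header.
        if [message] =~ /^col1,/ {
            drop { }
        }
        csv {
            columns => ["col1", "col2", "col3"]
            separator => ","
        }
    }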

Thanks again


But the bigger question here is still: "How can you have the first row parsed as the column headings, and then dropped during indexing?"

Or: "How can you have the first row parsed as the column headings, but keep that first line from being indexed?"

This discussion should be used to promote https://github.com/elastic/logstash/issues/2088, where a solution to this issue is proposed.
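
For anyone reading this on a newer Logstash: later versions of logstash-filter-csv added autodetect_column_names and skip_header options that address exactly this case (neither existed back in 1.5.0). A minimal sketch, assuming a recent plugin version:

    filter {
        csv {
            separator => ","
            # Take the column names from the file's first line instead
            # of hard-coding them...
            autodetect_column_names => true
            # ...and keep that header row out of the indexed output.
            skip_header => true
        }
    }

Note that autodetect_column_names depends on the header line being processed first, so it generally requires running the pipeline with a single worker.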

I too need to process CSV files with first-line schemas. Unfortunately, the schemas can also vary from file to file.

Not seeing anything out there, I got to work and coded up a subclass of logstash-input-file that adds CSV parsing with a first-line-schema mode. The basic CSV processing is largely borrowed from logstash-filter-csv.

Code is at https://github.com/jweite/logstash-input-csvfile. It's strictly alpha at this stage, but I'd be interested in having it considered for submission.

PS: while I initially considered enhancing logstash-filter-csv, I ultimately concluded that the only 100% reliable way to restart stream processing mid-file is to re-read the file's schema row, something that only the file input plugin can always do.

It'd be worth making a new thread for this so it doesn't get lost :slight_smile:

Will do, thanks for the tip Mark.