Reading the heading (1st line) of CSV file through logstash

kiranilla · November 23, 2017, 5:53pm

Hi,

E.g., a sample CSV file:

Incident ID,Status,Resolved By,Resolution Breached,Resolution Month,Resolution Date,Resolution Date & Time,
IM02370568,Closed,GUNTURS2,FALSE,1,1/4/2016,1/4/2016 10:35
IM02370648,Closed,PRASADR6,FALSE,1,1/1/2016,1/1/2016 22:51

I am trying to read the CSV file above.
The heading (1st line) needs to be read and passed to the csv plugin in the filter section.
The data also needs to be read and passed on for parsing.

I would like to pass the headings to the Logstash csv plugin dynamically, instead of hardcoding them.
Could you please help me resolve this issue, or at least suggest an alternative?

Thanks in advance.

guyboertje · November 23, 2017, 6:53pm

The CSV filter is stateless, meaning that it handles each event without knowing anything about any previous events. Also, the config is parsed and loaded at LS start, before any events have been processed.

There is currently no way to do this fully dynamically.

How many different CSV structures do you need to parse?
Does the filename and/or path contain a clue to the type of CSV structure?

Leandro_Sampaio · November 23, 2017, 7:13pm

Maybe you could use a grok condition that matches only the first line and keep it in session... It's not good practice, but it could solve your problem...

BinaryMonkey · November 24, 2017, 12:46am

filter {
  csv {
    separator => ","
    autodetect_column_names => true
    autogenerate_column_names => true
  }
}

kiranilla · November 24, 2017, 6:01am

Thanks @guyboertje for your quick response.

Answers to your questions are below:

How many different CSV structures do you need to parse?

> There is no fixed limit on the number of CSV structures. At present we have handled 2, but we expect more in future, so we would like a general solution that covers all structures.

Does the filename and/or path contain a clue to the type of CSV structure?

> Yes, the files can all live in one location (we can hard code the path), but the filename varies.

guyboertje · November 24, 2017, 11:34am

As @BinaryMonkey has suggested, you can try setting the first, or both, of:

autodetect_column_names => true
autogenerate_column_names => true

See https://github.com/logstash-plugins/logstash-filter-csv/blob/master/lib/logstash/filters/csv.rb#L126
The line of code linked above captures the first event seen by the plugin (after every LS restart) as the column names.

The real pitfalls with this are:

1. Only the very first line of the very first file will be used as the columns for all other files, because the CSV filter does not keep a map of file_path -> columns internally.
2. If you have to restart Logstash while it is halfway through a file, the first line of that file will not be re-read and the columns will become some arbitrary set of values.
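
For context, a minimal sketch of a pipeline that relies on header auto-detection and is therefore exposed to both pitfalls. The directory /data/csv is a hypothetical location, not something from this thread. As far as I know, the csv filter documentation also states that pipeline workers must be set to 1 for autodetect_column_names to work, so treat the worker note in the comments as an assumption to verify against your plugin version.

```
input {
  file {
    # Hypothetical location; every file matched here shares one csv filter.
    path => "/data/csv/*.csv"
    start_position => "beginning"
  }
}

filter {
  csv {
    separator => ","
    # The first event seen after each Logstash start becomes the column list,
    # whichever file it came from.
    autodetect_column_names => true
  }
}

output {
  stdout { codec => rubydebug }
}

# Assumption: run with a single worker so events reach the filter in order,
# e.g. bin/logstash -w 1 -f <this config>
```

With this config, a second file with a different header would still be parsed using the first file's column names.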

guyboertje · November 24, 2017, 11:47am

So if you have two different CSV structures, say structure-1 with columns "a","b" and structure-2 with columns "c","d", then...
Can you put all files with structure-1 into a sub-folder called structure-1, and so on?
If so, you can use if conditionals with a regex match on the file path to separate the files, so that each structure flows through its own csv filter. You still have to take into account the second pitfall from my previous post (a restart halfway through a file) unless you hard code the columns for each structure.

Logstash does not have an automatic way of doing this, because by design we strive for statelessness, and the file input (and Filebeat) is designed to resume reading from where it left off the previous time.
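
The sub-folder routing described above could look roughly like the sketch below. It assumes the file input puts the source file name in a field called path (newer ECS-enabled versions may use [log][file][path] instead) and that the sub-folders are named structure-1 and structure-2 as in the example. The columns are hard coded per structure, and the extra drop conditionals are my own addition to discard the header row, which would otherwise be indexed as a data event.

```
filter {
  # Assumption: [path] holds the source file path set by the file input.
  if [path] =~ /structure-1/ {
    csv {
      separator => ","
      columns => ["a", "b"]
    }
    # The header line parses as data where column "a" holds the literal "a".
    if [a] == "a" {
      drop { }
    }
  } else if [path] =~ /structure-2/ {
    csv {
      separator => ","
      columns => ["c", "d"]
    }
    if [c] == "c" {
      drop { }
    }
  }
}
```

Because the columns are fixed in the config, a restart halfway through a file no longer changes the field names. The trade-off is that each new CSV structure needs its own branch.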

kiranilla · November 26, 2017, 6:00pm

Thanks @guyboertje for your response.

As suggested, I will check the feasibility and go ahead with the implementation.

system · December 24, 2017, 6:00pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
