This is the input I am using for logstash.
ItemId AssetId ItemName Comment
11111 07 ABCDa XYZa
11112 07 ABCDb XYZb
11113 07 ABCDc XYZc
11114 07 ABCDd XYZd
11115 07 ABCDe XYZe
11116 07 ABCDf XYZf
11117 07 ABCDg XYZg
Date Time rows columns
19-05-2020 13:03 2 2
19-05-2020 13:03 2 2
19-05-2020 13:03 2 2
19-05-2020 13:03 2 2
19-05-2020 13:03 2 2
I need to remove the first 8 lines from the CSV, make the next line the column header, and parse the remaining lines as usual. Is there a way to do that in Logstash?
I would use a multiline codec to combine each group of lines. Hopefully something like
pattern => "^\d"
negate => false
what => "previous"
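A minimal file input using that codec might look like the following (the path, sincedb_path, and auto_flush_interval are illustrative assumptions, not part of the question):

```
input {
  file {
    path => "/tmp/test.csv"          # hypothetical path to the file above
    start_position => "beginning"
    sincedb_path => "/dev/null"      # re-read from the start on every run; handy for testing
    codec => multiline {
      pattern => "^\d"               # data lines start with a digit...
      negate => false
      what => "previous"             # ...and get appended to the preceding event
      auto_flush_interval => 2       # flush the final group once input goes quiet
    }
  }
}
```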
would work, so that you get two events. The first being
"ItemId AssetId ItemName Comment\n11111 07 ABCDa XYZa\n11112 07 ABCDb XYZb\n11113 07 ABCDc XYZc\n11114 07 ABCDd XYZd\n11115 07 ABCDe XYZe\n11116 07 ABCDf XYZf\n11117 07 ABCDg XYZg"
You may need to clean up the field separators. Use a character class rather than \s+ here, so that the newlines between lines survive for the split step below:

mutate { gsub => [ "message", "[ \t]+", " " ] }
Then tag each event
if [message] =~ /^Item/ {
    mutate { add_field => { "[@metadata][format]" => "format1" } }
} else {
    mutate { add_field => { "[@metadata][format]" => "format2" } }
}
Then use mutate+split to split [message] into an array of lines, and a split filter to turn that array into multiple events. You can then use csv filters for the two formats:
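That step might look like this. Note that "\n" inside a Logstash config string is only interpreted as a newline when config.support_escapes is enabled in logstash.yml; otherwise put a literal newline between the quotes:

```
mutate { split => { "message" => "\n" } }   # [message] becomes an array of lines
split { field => "message" }                # one event per array entry
```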
if [@metadata][format] == "format1" {
    csv { separator => " " columns => [ "ItemId", "AssetId", "ItemName", "Comment" ] ... }
    if [ItemId] == "ItemId" { drop {} }
} else {
    csv { separator => " " columns => [ "Date", "Time", "rows", "columns" ] ... }
    if [Date] == "Date" { drop {} }
}
There are other ways of handling the headers (autodetect_column_names, for example), but then you need pipeline.workers to be 1 and pipeline.ordered to be true, so that the header row is guaranteed to be processed first.
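With ordering guaranteed, the csv filter inside each format's branch could be reduced to something like this (a sketch only; autodetect caches the column names from the first row it sees, which is why event order matters):

```
# requires pipeline.workers: 1 and pipeline.ordered: true in the pipeline settings
csv {
  separator => " "
  autodetect_column_names => true   # take column names from the first row
}
```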