How to remove multiple header lines from a CSV with a Logstash configuration file

Hello All,

I need to process CSV reports generated by Windows File Server Resource Manager. Each file starts with several header lines that need to be removed. How can I do this in the Logstash configuration file? The following is a sample.

"Large Files Report"

Generated at: "4/24/2020 10:03:51 AM"

"Lists files that are a specified size or larger. Use this report to quickly identify the files that are consuming the most disk space on the server. These can help you quickly reclaim large quantities of disk space."

Report settings:
Machine: ,"AAA-FP01"
Report Folders: ,"C:","F:",
Parameters: ,"Minimum file size: 300 MB"

Report Totals
Files shown in the report
Files,Total size on Disk
"9","24,384 MB"
All files matching report criteria
Files,Total size on Disk
"9","24,384 MB"

Size by Owner
Owner,Size on Disk,Files
"BUILTIN\Administrators","22,905 MB","7"
"NT AUTHORITY\SYSTEM","1,479 MB","2"

Size by File Group
File Group,Size on Disk,Files
"Compressed Files","1,420 MB","1"
"System Files","1,024 MB","1"
"Executable Files","358 MB","1"
"All other files","21,581 MB","6"

Report statistics:
File name,Folder,Owner,Size on Disk,Size,Last accessed
"37{3808876b-c176-4e48-b7ae-04046e6cc752}","F:\System Volume Information","BUILTIN\Administrators","15,104 MB","15,104 MB","4/24/2020 10:03:51 AM"
"SW_DVD9_SQL_Svr_Standard_Edtn_2008_R2_English_MLF_X16-29588.ISO","F:\sql2008r2","BUILTIN\Administrators","4,177 MB","4,177 MB","4/19/2012 2:33:34 PM"
"Windows10.0-KB4530689-x64.cab","C:\Windows\ccmcache\8","BUILTIN\Administrators","1,420 MB","1,420 MB","12/11/2019 11:03:17 PM"
"DataStore.edb","C:\Windows\SoftwareDistribution\DataStore","NT AUTHORITY\SYSTEM","1,158 MB","1,158 MB","1/16/2020 1:46:28 AM"
"pagefile.sys","C:","BUILTIN\Administrators","1,024 MB","1,024 MB","1/16/2020 5:43:44 PM"
"f_cache2.dat","C:\Program Files\avs\var","BUILTIN\Administrators","512 MB","512 MB","4/13/2020 11:01:59 PM"
"SQLServer2008R2SP3-KB2979597-x64-ENU.exe","F:\sql2008r2","BUILTIN\Administrators","358 MB","358 MB","5/16/2015 10:54:20 AM"
"4e825086372e1e44_blobs.bin","C:\Windows\WinSxS\ManifestCache","NT AUTHORITY\SYSTEM","321 MB","321 MB","1/16/2020 4:32:49 PM"
"Winre.wim","C:\Recovery\WindowsRE","BUILTIN\Administrators","309 MB","309 MB","11/21/2016 3:47:51 PM"

======================

  1. I want to remove every line up to the "File name,Folder,Owner...." line.
  2. If possible, keep the line Machine: ,"AAA-FP01". I have many clients, and I need the machine name so that I can sort the results by client.

Thanks.

How do you read the input, and how do you determine which CSV you want to parse?

If you can separate those inputs properly, you can drop those headers with a simple conditional:

if [message] =~ /Owner,Size on Disk,Files/ {
  drop {}
}

You can also set tags on each input when reading from multiple sources, and then apply the condition only to events that carry the matching tag.
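
A minimal sketch of the tag approach, assuming the reports are read with the file input. The path and the tag name here are hypothetical placeholders, not something from your setup:

```
input {
  file {
    # Hypothetical location of the exported reports (forward slashes work on Windows)
    path => "C:/reports/*.csv"
    # Hypothetical tag that marks events coming from these reports
    tags => ["large_files_report"]
    start_position => "beginning"
  }
}

filter {
  # Only touch events that came from the tagged input
  if "large_files_report" in [tags] {
    if [message] =~ /Owner,Size on Disk,Files/ {
      drop {}
    }
  }
}
```

Events from other inputs pass through untouched, so the same pipeline can handle different report types with different conditions.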

If you want to remove several different header lines with a single condition, you can match all of them with a regex alternation:

if [message] =~ /Owner,Size on Disk,Files|File Group,Size on Disk,Files/ {
  drop {}
}
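
Since your sample has many different header and summary lines, it may be easier to invert the logic: keep only lines that look like the detail rows under "File name,Folder,Owner,..." and drop everything else. The sketch below does that and also copies the machine name onto each row. It is only a sketch under some assumptions: the field names (machine, file_name, and so on) are my own choices, the regex is derived from the sample you posted, and the ruby filter relies on events arriving in file order, so it assumes a single pipeline worker (pipeline.workers: 1) and one report being read at a time.

```
filter {
  # Remember the machine name from the 'Machine: ,"AAA-FP01"' line and
  # copy it onto every following event (assumes ordered, single-worker processing).
  ruby {
    init => "@machine = nil"
    code => '
      m = event.get("message").to_s.match(/^Machine: ,"([^"]*)"/)
      @machine = m[1] if m
      event.set("machine", @machine) if @machine
    '
  }

  # Keep only detail rows: six quoted fields, with the 4th and 5th ending in " MB".
  # Titles, settings, totals, and column-name lines do not match and are dropped.
  if [message] !~ /^"[^"]*","[^"]*","[^"]*","[^"]* MB","[^"]* MB","[^"]*"\s*$/ {
    drop {}
  }

  # Split the remaining rows into named fields (names are hypothetical).
  csv {
    separator => ","
    columns => ["file_name", "folder", "owner", "size_on_disk", "size", "last_accessed"]
  }
}
```

With your sample, the three-column rows under "Size by Owner" and "Size by File Group" are dropped because they do not have six fields, and each surviving row should carry machine set to AAA-FP01. If the report layout differs on other servers (for example sizes not expressed in MB), the regex would need adjusting.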