I need to extract the name of the last folder in the path of each CSV file picked up by the file input plugin, and then add a field holding that folder name, which in my case is a taskId.
Here is what I'm doing at the moment, but I'm not sure whether it is correct or whether there is a better way to achieve it:
input {
  file {
    path => "/usr/share/input/**/*.csv"
    start_position => beginning
    sincedb_path => "/dev/null"
    discover_interval => 2
    stat_interval => "1 s"
  }
}
filter {
  grok { match => ["path", "^/[^/]+/[^/]+/[^/]+/[^/]+/(?<taskId>[^/]+)" ] }
}
output {
  stdout { codec => rubydebug }
  file {
    path => "/usr/share/output/output.json"
    codec => "json_lines"
  }
}
And here's the output I'm getting:
{"@timestamp":"2019-08-20T19:35:29.365Z","message":"Sandy,Test text,1.25,f45,3\r","host":"b7ee10dcd65c","@version":"1","path":"/usr/share/input/jobId/134598abcr4_f89/sample.csv","taskId":"134598abcr4_f89"}
{"@timestamp":"2019-08-20T19:35:29.364Z","message":"Mike,Hello,11.5,12A,1\r","host":"b7ee10dcd65c","@version":"1","path":"/usr/share/input/jobId/134598abcr4_f89/sample.csv","taskId":"134598abcr4_f89"}
{"@timestamp":"2019-08-20T19:35:29.364Z","message":"Nicolas,Test Test,0.25,13B,2\r","host":"b7ee10dcd65c","@version":"1","path":"/usr/share/input/jobId/134598abcr4_f89/sample.csv","taskId":"134598abcr4_f89"}
{"@timestamp":"2019-08-20T19:35:29.338Z","message":"user_name,text,size,output,new_column\r","host":"b7ee10dcd65c","@version":"1","path":"/usr/share/input/jobId/134598abcr4_f89/sample.csv","taskId":"134598abcr4_f89"}
Maybe there is a more generic way to do it, one that doesn't depend on the exact number of folders in the path. If there is, I'd really appreciate the help.
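One variant that might be more generic, as a minimal sketch rather than a confirmed solution: anchor the pattern at the end of the path instead of counting segments from the root, so that it captures whichever folder directly contains the file. This assumes the file location is still in the `path` field, as in the output above; if ECS compatibility is enabled in a newer Logstash, the field may be `[log][file][path]` instead.

```
filter {
  grok {
    # Capture the folder that directly contains the file:
    # "/<taskId>/<file name>" at the very end of the path,
    # regardless of how many folders come before it.
    match => { "path" => "/(?<taskId>[^/]+)/[^/]+$" }
  }
}
```

For `/usr/share/input/jobId/134598abcr4_f89/sample.csv` this should still give `taskId` the value `134598abcr4_f89`, and it would keep doing so if the files were moved one level deeper or shallower.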