Get last folder from path in Logstash

I need to extract the name of the last folder in the path where the file plugin picked up a CSV file, and then create a field holding that folder name, which is a taskId in my case.

Here is an example of what I'm doing, but I'm not sure whether it is correct or whether there's a better way to achieve it:

input {
  file {
    path => "/usr/share/input/**/*.csv"
    start_position => beginning
    sincedb_path => "/dev/null"
    discover_interval => 2
    stat_interval => "1 s"
  }
}

filter {
   grok { match => ["path", "^/[^/]+/[^/]+/[^/]+/[^/]+/(?<taskId>[^/]+)" ] }
}

output {
    stdout { codec => rubydebug }
    file {
        path => "/usr/share/output/output.json"
        codec => "json_lines"
    }
}

And here's the output I'm getting:

{"@timestamp":"2019-08-20T19:35:29.365Z","message":"Sandy,Test text,1.25,f45,3\r","host":"b7ee10dcd65c","@version":"1","path":"/usr/share/input/jobId/134598abcr4_f89/sample.csv","taskId":"134598abcr4_f89"}
{"@timestamp":"2019-08-20T19:35:29.364Z","message":"Mike,Hello,11.5,12A,1\r","host":"b7ee10dcd65c","@version":"1","path":"/usr/share/input/jobId/134598abcr4_f89/sample.csv","taskId":"134598abcr4_f89"}
{"@timestamp":"2019-08-20T19:35:29.364Z","message":"Nicolas,Test Test,0.25,13B,2\r","host":"b7ee10dcd65c","@version":"1","path":"/usr/share/input/jobId/134598abcr4_f89/sample.csv","taskId":"134598abcr4_f89"}
{"@timestamp":"2019-08-20T19:35:29.338Z","message":"user_name,text,size,output,new_column\r","host":"b7ee10dcd65c","@version":"1","path":"/usr/share/input/jobId/134598abcr4_f89/sample.csv","taskId":"134598abcr4_f89"}

Maybe there is a more generic way to do it; if there is, I'd really appreciate the help.

I would anchor the pattern to the end of the path, so that it does not matter how deep in the tree the file is.

grok { match => { "path" => "/(?<taskId>[^/]+)/[^/]+$" } }
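
To see why anchoring at the end makes the match independent of depth, here is a minimal sketch outside Logstash that applies the same expression to paths of different depths. It is only an illustration, not a Logstash config: Python writes named groups as (?P<name>...) where grok uses (?<name>...), the helper name last_folder is made up for this example, and the second and third paths are hypothetical.

```python
import re

# Same expression as the grok pattern above; only the named-group
# spelling differs, (?P<taskId>...) in Python vs (?<taskId>...) in grok.
LAST_FOLDER = re.compile(r"/(?P<taskId>[^/]+)/[^/]+$")


def last_folder(path):
    """Return the name of the folder that directly contains the file, or None."""
    match = LAST_FOLDER.search(path)
    return match.group("taskId") if match else None


# Path taken from the output in the question (file is four folders deep).
print(last_folder("/usr/share/input/jobId/134598abcr4_f89/sample.csv"))
# Hypothetical shallower path: the same folder name is still captured.
print(last_folder("/usr/share/input/134598abcr4_f89/sample.csv"))
# Hypothetical file with no parent folder: the pattern does not match.
print(last_folder("/sample.csv"))
```

The first two calls both return 134598abcr4_f89, whereas the pattern from the question counts four path segments from the start and would only work at one fixed depth. The last call returns None; in Logstash the equivalent outcome would be a failed grok match.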