Extracting path fragments from `log.file.path` field

Hi,

I'm trying to extract two strings from the metadata log.file.path field, and create one or two fields, depending on whether the first of these fields has the appropriate value.

Sample values of the log.file.path field are as follows:

  • /usr/share/logstash/ingest_data/MY-APP-B/20240517/MY-APP-B-JOBNAMEX-28599000-fqlg6-12.log
  • /usr/share/logstash/ingest_data/MY-APP-A/20240517/MY-APP-A-6b9bbc6864-plgns-16.log
  • /usr/share/logstash/ingest_data/MY-APP-B/20240517/MY-APP-B-JOBNAMEY-145-manual-w68qx-13.log
  • /usr/share/logstash/ingest_data/MY-APP-C/20240517/MY-APP-C-vsgxg-16.log

The text strings I want to extract are:

  • MY-APP-<letter> (e.g. MY-APP-A or MY-APP-B) and save it in the applicationName field
  • JOBNAME<letter> (e.g. JOBNAMEX or JOBNAMEY) and save it in the jobName field, but only if applicationName is MY-APP-B.

In the Logstash configuration file I created a filter:

  mutate { copy => { "[log][file][path]" => "[@metadata][path]" } }
  mutate { split => { "[@metadata][path]" => "/" } }
  mutate { add_field => { "applicationName" => "%{[@metadata][path][5]}" } }
  if [applicationName] == "MY-APP-B" {
    mutate { copy => { "[log][file][path]" => "[@metadata][jobNameFull]" } }
    mutate { split => { "[@metadata][jobNameFull]" => "-" } }
    mutate { add_field => { "jobName" => "%{[@metadata][jobNameFull][3]}" } }
  }

It basically works, but it is not robust against a change in the location of the log files (I suppose the /usr/share/logstash part could change in the future). Moreover, I think the split => { "[@metadata][jobNameFull]" => "-" } expression is very fragile, because it splits on every - that appears anywhere in the log.file.path field.

Is there a way to make this filter more universal and safe?

  1. Is it possible to create a filter that will assign a value to the applicationName field from what is between the string /ingest_data/ and the next /? Or maybe it would be better to take the string between / and /YYYYMMDD (20240517 in my case). I'm not sure which is best.
  2. Is it possible to create a filter that will assign a value to the jobName field from what is between MY-APP-B (which contains -) and the next character -?

Maybe some regex? But I'm not sure how to use it in conjunction with field references.

I would keep it simple and just use grok with regexps that match only the part of the path that you want:

    grok { "match" => { "[log][file][path]" => "/(?<appName>[A-Za-z\-]+)/%{NUMBER:date}/" } }
    if [appName] == "MY-APP-B" {
        grok { "match" => { "[log][file][path]" => "/MY-APP-B-(?<jobName>JOBNAME\w)-" } }
    }
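
To see what those two patterns pick out of the sample paths without running Logstash, here is a rough Python sketch of the same logic. It is only an approximation: Python writes named groups as `(?P<name>...)` where grok uses `(?<name>...)`, `\d+` stands in for `%{NUMBER:date}`, and the `extract` helper and the `/var/log/other/...` path are made up for illustration.

```python
import re

# Python equivalents of the two grok expressions (assumed approximations).
# grok's (?<name>...) becomes (?P<name>...); \d+ replaces %{NUMBER:date}.
APP_RE = re.compile(r"/(?P<appName>[A-Za-z\-]+)/(?P<date>\d+)/")
JOB_RE = re.compile(r"/MY-APP-B-(?P<jobName>JOBNAME\w)-")


def extract(path):
    """Return the fields the two grok filters would add for one path."""
    fields = {}
    app = APP_RE.search(path)
    if app:
        fields["appName"] = app.group("appName")
        fields["date"] = app.group("date")
    # Mirror the conditional: only look for a job name for MY-APP-B.
    if fields.get("appName") == "MY-APP-B":
        job = JOB_RE.search(path)
        if job:
            fields["jobName"] = job.group("jobName")
    return fields


SAMPLES = [
    "/usr/share/logstash/ingest_data/MY-APP-B/20240517/MY-APP-B-JOBNAMEX-28599000-fqlg6-12.log",
    "/usr/share/logstash/ingest_data/MY-APP-A/20240517/MY-APP-A-6b9bbc6864-plgns-16.log",
    "/usr/share/logstash/ingest_data/MY-APP-B/20240517/MY-APP-B-JOBNAMEY-145-manual-w68qx-13.log",
    "/usr/share/logstash/ingest_data/MY-APP-C/20240517/MY-APP-C-vsgxg-16.log",
    # Hypothetical relocated path, to check that the prefix does not matter.
    "/var/log/other/MY-APP-B/20240517/MY-APP-B-JOBNAMEX-1-abcde-1.log",
]

for sample in SAMPLES:
    print(sample, "->", extract(sample))
```

Because neither pattern is anchored to the start of the path, the directories in front of the application folder can change without affecting the match; the first pattern only needs a letters-and-hyphens directory directly followed by an all-digit directory.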

Thank you @Badger, it works.

The simplicity of your solution is brilliant!

Best regards.