Extracting contents of the filename (from the source) and creating fields

Hello

I am trying to extract parts of the filename (made available in the source field) and would like to add tags or create new fields based on certain criteria.

For example if this was the data in the source field:
/app/appname/log/database-server-n5-server1234.domain.local.out

I would like to extract the following bits from the filename and create fields/tags for them:
instancenumber: "n5"
hostname: "server1234.domain.local.out"

I have tried the following but I keep getting config errors:
    else if [type] == "AppCacheProxy" {
        grok {
            match => { "message" => "\A%{TIMESTAMP_ISO8601}%{NOTSPACE}%{SPACE}%{GREEDYDATA}" }
            match => { "@source_path", "app/appname/log/database-server-%{NOTSPACE:instancenumber}-%{NOTSPACE:hostname}.log" }
            break_on_match => false
        }

I also tried the following line in place of the one above:
    match => { "source_path" => "app/appname/log/database-server-%{NOTSPACE:instancenumber}-%{NOTSPACE:hostname}.log" }

What is the best way to achieve this?

Regards

You should be able to do this with grok. However, your pattern does not seem to match the data: it ends in .log, which is not present in the sample data you provided, so it will not match.
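To illustrate the mismatch outside Logstash, here is a minimal Python sketch. It assumes that %{NOTSPACE:name} can be approximated by the named group (?P<name>\S+) and escapes the dot in the suffix; the helper name build is made up for this example, and Python's regex engine is not identical to the one grok uses, so treat it as a rough check rather than a faithful reproduction.

```python
import re

# Sample value of the source field, copied from the question.
sample = "/app/appname/log/database-server-n5-server1234.domain.local.out"


def build(suffix):
    """Approximate the grok pattern, with %{NOTSPACE:x} written as (?P<x>\\S+).

    `build` is a hypothetical helper for this sketch, not part of Logstash.
    """
    return re.compile(
        r"app/appname/log/database-server-"
        r"(?P<instancenumber>\S+)-(?P<hostname>\S+)" + re.escape(suffix)
    )


# The pattern as posted ends in .log, so it finds nothing in the sample.
log_match = build(".log").search(sample)
print(log_match)  # None

# Ending the pattern in .out lets both fields be captured.
out_match = build(".out").search(sample)
print(out_match.groupdict())
```

With the .out suffix the sketch captures n5 as the instance number and server1234.domain.local as the hostname, which suggests the grok pattern only needs its file extension adjusted for this sample.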

Apologies, that was an error on my part. However, in the actual test environment the details are in fact correct.

The data does not seem to be parsing for some reason.

Please provide the data and configuration that is not working. It will be a lot easier for someone to help if that is available.

Hello

Sure, I'll list it all below. To summarise, I am trying to extract different parts of the filename using grok.

This is the configuration that is not working:

     else if [type] == "APPARATUSCacheServer" {
            grok {
                    match => { "file" => "%{GREEDYDATA}-%{GREEDYDATA:instance}-%{GREEDYDATA:hostname}.out" }
                    #add_field => { "instance" => "%{instancelogs}"  }
                    #add_field => { "hostdomain" => "%{hostname}"  }
            }
    }
    else if [type] == "APPARATUSacheProxy" {
            grok {
                    match => { "file" => "%{GREEDYDATA}_%{GREEDYDATA:instance}-%{GREEDYDATA:hostname}.out" }
                    #add_field => { "proxy" => "%{proxylogs}"  }
                    #add_field => { "hostname" => "%{hostdomain}"  }
            }
    }
    else if [type] == "coherence-server-instance" {
            grok {
                    match => { "file" => "%{GREEDYDATA}-%{GREEDYDATA:instance}-%{GREEDYDATA:hostname}.out" }
                    #add_field => { "instance" => "%{instancelogs}"  }
                    #add_field => { "hostdomain" => "%{hostname}"  }
            }
    }
    else if [type] == "coherence-server-proxy" {
            grok {
                    match => { "file" => "%{GREEDYDATA}_%{GREEDYDATA:instance}-%{GREEDYDATA:hostname}.out" }
                    #add_field => { "proxy" => "%{proxylogs}"  }
                    #add_field => { "hostname" => "%{hostdomain}"  }
            }
    }
    else if [type] == "coherence-server-mbean" {
            grok {
                    match => { "file" => "%{GREEDYDATA}-%{GREEDYDATA:instance}-%{GREEDYDATA:hostname}.out" }
                    #add_field => { "proxy" => "%{proxylogs}"  }
                    #add_field => { "hostname" => "%{hostdomain}"  }
            }
    }

I have set all the [type] fields in Filebeat as shown below:

    -
      paths:
        - /app/apparatus/log/logs/APPARATUSCache_APPARATUSCacheServer*
      fields:
        applog : APPARATUSCacheServer
      fields_under_root: true
      document_type : APPARATUSCacheServer

    -
      paths:
        - /app/apparatus/log/logs/APPARATUSCache_Main*
      fields:
        applog : APPARATUSCacheProxy
      fields_under_root: true
      document_type : APPARATUSCacheProxy

I hope that's a bit more helpful.

I would recommend not using multiple or leading GREEDYDATA patterns in your grok expressions, as this can be quite inefficient. Use other patterns that match your data more precisely.
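Besides the cost, stacked greedy patterns can also split a filename at the wrong separator. The Python sketch below is only an approximation: GREEDYDATA is written as .*, the filename with a hyphenated hostname is a hypothetical example that does not come from this thread, and the tighter pattern is one possible alternative rather than the recommended one.

```python
import re

# Hypothetical filename whose hostname contains a hyphen (not from the thread).
path = "/app/appname/log/database-server-n5-my-host.domain.local.out"

# Same shape as the greedy grok expression, with %{GREEDYDATA} written as .*
greedy = re.compile(r".*-(?P<instance>.*)-(?P<hostname>.*).out")

# A tighter alternative: literal prefix, negated character classes, and an
# escaped, end-anchored extension, so the engine has far fewer ways to backtrack.
specific = re.compile(
    r"database-server-(?P<instance>[^-/]+)-(?P<hostname>[^/]+)\.out$"
)

greedy_fields = greedy.search(path).groupdict()
specific_fields = specific.search(path).groupdict()

print(greedy_fields)    # instance is "my", hostname is "host.domain.local"
print(specific_fields)  # instance is "n5", hostname is "my-host.domain.local"
```

In grok, the same idea can be expressed with inline named captures such as (?<instance>[^-/]+), which keeps each field limited to the characters it is expected to contain.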

Hello

I am trying to extract parts of a filename from the source field using the expression below, but the extracted fields do not appear in Elasticsearch / Kibana. Does this look OK to you?

    else if [source] =~ "GDO" {
            grok {
                    match => { "path" => "(?<app>[^_\.]+)_(?<class>[^_\.]+)_(?<member>[^_\.-]+)(?:-[^_]+)?(?:_backen(?<db>d))?\.log" }
            }
    }

PS: Source is in the format: /app/apparatus/log/logs/APPARATUS_APPARATUSCacheServer_n5-grdserver.domain.local.log
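As a sanity check outside Logstash, the expression can be tried against that sample in Python. This is only a rough sketch: the named groups are rewritten from (?<name>...) to Python's (?P<name>...) syntax, and Python's regex engine is assumed to behave like grok's for this particular pattern.

```python
import re

# Sample value of the source field, as given in the PS above.
source = (
    "/app/apparatus/log/logs/"
    "APPARATUS_APPARATUSCacheServer_n5-grdserver.domain.local.log"
)

# The expression from the grok block, with Python-style named groups.
pattern = re.compile(
    r"(?P<app>[^_\.]+)_(?P<class>[^_\.]+)_(?P<member>[^_\.-]+)"
    r"(?:-[^_]+)?(?:_backen(?P<db>d))?\.log"
)

match = pattern.search(source)
fields = match.groupdict() if match else None
print(fields)
```

In this sketch the expression does match: class comes out as APPARATUSCacheServer and member as n5, while app also picks up the leading directories (/app/apparatus/log/logs/APPARATUS) because slashes are not excluded from its character class. If the regex itself matches, it may be worth checking whether the event really has a field called path, since the grok block matches against "path" while the filename is described as living in the source field.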