Add a field from one grok match to another

I've got a log on our build cluster that starts with a job id line like the following:
"Job <7073381> is submitted to queue "
I'd like to extract the job id from that line and add it to all other matched lines that look like this:
xmelab: *W,CUSENMP: Use -NAMEMAP_MIXGEN with name mapped instantiation 'RBR_SV_PARAM_I'

I'm using the following grok for matching the fields:

grok {
  match => {
    "message" => [
      '%{GREEDYDATA} \<%{NUMBER:jobid}\> %{GREEDYDATA} \<%{WORD:queue}\>',
      '%{WORD:process}: %{DATA:log_level},%{DATA:subprocess}: %{GREEDYDATA:logMessage}'
    ]
  }
}

I can't find a way to add the jobid value from the first match to the other matched lines in the same file.

Hi,

From what you've told us, I assume that:

  • you are using the file input plugin, so Job <7073381> is submitted to queue and xmelab: *W,CUSENMP: Use -NAMEMAP_MIXGEN are two different lines and therefore two different events (see the input sketch after this list).
  • you can have other lines that don't match either of the two grok patterns you gave.
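
To make the first point concrete, the input could look something like this sketch (the path is a hypothetical placeholder, and start_position / sincedb_path are set this way only to simplify testing):

input {
  file {
    # Hypothetical location of the build log; adjust to your environment
    path => "/var/log/build/build.log"
    # Read the file from the beginning and don't persist read positions (handy while testing)
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}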

To do what you want, I think you have to use a ruby filter to save the jobid value so it can be reused on later lines.

grok {
  match => {
    "message" => '%{GREEDYDATA} \<%{NUMBER:jobid}\> %{GREEDYDATA} \<%{WORD:queue}\>'
  }
  # If this pattern matches, add 'source_job_id_line' to the 'tags' field.
  add_tag => [ "source_job_id_line" ] 
}

grok {
  match => {
    "message" => '%{WORD:process}: %{DATA:log_level},%{DATA:subprocess}: %{GREEDYDATA:logMessage}'
  }
  # If this pattern matches, add 'destination_job_id_line' to the 'tags' field.
  add_tag => [ "destination_job_id_line" ]
}

# If the current line contains a new job id
if 'source_job_id_line' in [tags] {
  ruby {
    # Initialize the class variable @@jobId to -1 at Logstash startup time
    init => '@@jobId = -1'
    # Save the content of the 'jobid' field in the class variable
    # (double quotes inside the single-quoted config string, so the Ruby code parses)
    code => '@@jobId = event.get("jobid")'
    remove_tag => [ 'source_job_id_line' ]
  }
}

# If the current line needs the job id
if 'destination_job_id_line' in [tags] {
  ruby {
    code => '
      # Append the saved job id to the end of the line
      event.set("message", event.get("message") + @@jobId.to_s)
      # Add a jobid field to this event as well
      event.set("jobid", @@jobId.to_s)
    '
    remove_tag => [ 'destination_job_id_line' ]
  }
}

What it does:

  • First, I split the grok filter in two so that a different tag can be added depending on which pattern the current line matches.
  • Then, depending on the value in the tags field, I either update the @@jobId class variable or use it.

I haven't tried the code, so it may not work on the first try. Also, this configuration requires setting the number of pipeline workers to 1 to make sure that Logstash processes the file line by line in the correct order.
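
For example, this can be set in logstash.yml (or with -w 1 on the command line); a sketch of just that setting. On Logstash 7.7 and later, pipeline.ordered defaults to auto, which preserves event order when a single worker is used:

# logstash.yml
pipeline.workers: 1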

Edit: As Badger explains in the reply below, we need to use a class variable.

Cad.

No, each filter instance is a different instance, but you can use a variable with class scope rather than instance scope.
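
To illustrate the difference (a minimal sketch, independent of the configuration above; the jobid_copy field name is just for illustration): an instance variable such as @job_id set in one ruby filter is not visible in another ruby filter, whereas a class variable such as @@jobId is shared by all ruby filter instances in the pipeline.

ruby {
  # Class variable: shared with every other ruby filter in the pipeline
  code => '@@jobId = event.get("jobid")'
}
ruby {
  # A different ruby filter instance can still read the value written above;
  # an instance variable like @job_id would not be visible here
  code => 'event.set("jobid_copy", @@jobId)'
}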

