Grok pattern for stashing Ansible logs

Hello,

I am new to Elasticsearch. I am on Elasticsearch 7.17.1.

Below are the logs that I am trying to parse.

2022-03-08 14:52:50,672 p=1654827 u=ansible n=ansible | TASK [lvm : Formatting xfs filesystem] **
2022-03-08 14:52:50,672 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-1]
2022-03-08 14:52:51,816 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-2]
2022-03-08 14:52:52,030 p=1654827 u=ansible n=ansible | TASK [lvm : Backup original fstab]**
2022-03-08 14:52:52,031 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-1]
2022-03-08 14:52:52,049 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-2]

The problem I am facing is correlating the task name with the host events, i.e. the lines below it. In the logs above, the lines below each task are exactly the same for both tasks (except for the timestamps), which gives me no way of telling Logstash which event belongs to which task.

How can I relate the events on each host to the specific task? I just want to add a field called 'task_name' to each host event, which will help categorise which event belongs to which task.

Thanks in Advance.

I have checked on https://grokdebug.herokuapp.com; this grok pattern should work:
%{TIMESTAMP_ISO8601:timer} p=%{POSINT:processid} u=%{DATA:appu} n=%{DATA:appn} \| %{DATA:action}(\s|\:\s)\[%{DATA:activity}\]
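
For readers who want to try the pattern outside the grok debugger, it can be approximated in plain Ruby. This is only a sketch: the regex hand-expands TIMESTAMP_ISO8601, POSINT and DATA into simplified equivalents (an assumption, not grok's exact definitions), and `LINE_RE` and `parse_line` are hypothetical names.

```ruby
# Simplified stand-in for the grok pattern above; group names match its fields.
LINE_RE = /
  (?<timer>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3})\s  # timestamp, simplified
  p=(?<processid>\d+)\s                                    # process id
  u=(?<appu>\S+)\s                                         # user, simplified to non-space
  n=(?<appn>\S+)\s                                         # name, simplified to non-space
  \|\s
  (?<action>.*?)(?:\s|:\s)\[(?<activity>.*?)\]             # action, then bracketed activity
/x

# Returns a hash of captured fields, or nil when the line does not match.
def parse_line(line)
  match = LINE_RE.match(line)
  match ? match.named_captures : nil
end

fields = parse_line("2022-03-08 14:52:50,672 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-1]")
puts fields.inspect
```

Both the TASK lines and the host lines match, so the pattern parses every line but does not by itself link a host line to its task.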

Hello Rios

First of all, thanks for your time and reply. Actually, I already have the
grok pattern for all the logs that I pasted above, and it is working fine.

My grok pattern is able to extract the task name (Formatting xfs filesystem) from the first
line, and I am storing it in a field called task_name.

Now my real problem is: how can I add task_name as a field to the next 2
events shown below?

2022-03-08 14:52:50,672 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-1]
2022-03-08 14:52:51,816 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-2]

Similarly, I have the task_name field from the 4th event and I want to
add it as a field to the 2 events following it.

I tried options such as mutate with add_field, but I was not able to
achieve this.

In simple terms, I want to grab the task name from the 1st event
(which I have achieved) and then add it as a field to the next 2 events following it.
The same process needs to be repeated for every task and
the subsequent events below it.

Thanks in Advance.

Try the aggregate filter.
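
As a rough illustration of the idea behind that suggestion: the aggregate filter keeps a map per task id that later events with the same id can read. The plain-Ruby sketch below mimics this with a hash keyed by process id. It is not the plugin's API; `TaskTracker`, `observe`, the simplified regexes and the choice of the `p=` value as the key are all assumptions made for illustration.

```ruby
# Plain-Ruby illustration of per-key state, loosely modelled on the idea of
# one shared map per task id. This is not the aggregate plugin's actual API.
class TaskTracker
  TASK_RE = /TASK\s+\[(?<role>[^:\]]+?)\s*:\s*(?<task_name>[^\]]+)\]/
  HOST_RE = /(?<task_status>\w+):\s+\[(?<hostname>[^\]]+)\]/
  PID_RE  = /\bp=(?<processid>\d+)\b/

  def initialize
    # One map per process id, created on first use.
    @maps = Hash.new { |hash, key| hash[key] = {} }
  end

  # Returns the fields for one line, with task_name taken from the most
  # recent TASK line seen for the same process id. Returns nil for lines
  # that match neither shape.
  def observe(line)
    pid_match = PID_RE.match(line)
    return nil unless pid_match

    pid = pid_match[:processid]
    map = @maps[pid]

    if (task = TASK_RE.match(line))
      map["task_name"] = task[:task_name].strip
      { "processid" => pid, "role" => task[:role].strip, "task_name" => map["task_name"] }
    elsif (host = HOST_RE.match(line))
      { "processid" => pid, "task_status" => host[:task_status],
        "hostname" => host[:hostname], "task_name" => map["task_name"] }
    end
  end
end
```

Keying the state by process id means two Ansible runs writing to the same log would not overwrite each other's task name, which a single global variable cannot guarantee.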

I stumbled upon this 6-year-old post, https://discuss.elastic.co/t/keeping-global-variables-in-ls/39908, which describes exactly the same scenario as mine.

After following this post, I was able to get partial success using the ruby filter.

My progress so far:

  1. Using the ruby filter, store the value of the task_name field in a variable called @@taskname that will persist for later use.

  2. Using grok, find the events where the above field should be inserted.

  3. Insert the field into all the events that match in step 2.
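
The three steps above can be condensed into a single pass in which the stored value is updated and read for each event in turn. The sketch below is a minimal illustration under stated assumptions: `FakeEvent` is a hypothetical stand-in exposing only `get` and `set` (inside Logstash the real `event` object would be used), `TaskNameCarrier` is a hypothetical name, and an instance variable replaces `@@taskname`.

```ruby
# Minimal stand-in for a Logstash event: only get/set, for illustration.
class FakeEvent
  def initialize(fields = {})
    @fields = fields.dup
  end

  def get(name)
    @fields[name]
  end

  def set(name, value)
    @fields[name] = value
  end
end

# Carries the most recent task_name forward onto later host events.
class TaskNameCarrier
  def initialize
    @taskname = nil # last task_name seen so far
  end

  def process(event)
    if event.get("task_name")
      # Step 1: a TASK event, so remember its name.
      @taskname = event.get("task_name")
    elsif event.get("task_status") && @taskname
      # Steps 2 and 3: a host event, so copy the remembered name onto it.
      event.set("task_name", @taskname)
    end
    event
  end
end
```

Because remembering and copying happen in the same call, each host event sees the value set by the closest preceding TASK event, provided the events arrive in file order.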

What I am stuck at:

  1. The "_grokparsefailure" tag gets added to all events, even though the patterns have matched and the task_name field was inserted.

  2. When I run Logstash again, all 4 events get the same task_name. The expected behaviour is that 2 events get task_name "Formatting xfs filesystem" and the other 2 get the 2nd task name. If I run it again, one of the events gets a null value. Any ideas about the reason behind this random behaviour?

I have made sure to set the Logstash workers to '1' for thread safety.
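
On the first point, one likely cause (an assumption based on the conf in this post, not something confirmed in the thread) is that every event passes through both grok filters, and grok by default tags an event with "_grokparsefailure" whenever its pattern does not match. A TASK line fails the second pattern and a host line fails the first, so every event ends up tagged even though one of the two patterns matched. The Ruby sketch below imitates this with two simplified regexes standing in for the two grok patterns; `TASK_PATTERN`, `HOST_PATTERN`, `tags_for` and `tags_for_single_filter` are hypothetical names.

```ruby
# Simplified stand-ins for the two grok patterns in the conf.
TASK_PATTERN = /TASK\s*\[\w+\s*:\s*.*\]/
HOST_PATTERN = /\w+:\s*\[[\w.-]+.*/

# Mimics two grok filters run back to back: the event is tagged as soon as
# either filter fails to match, whatever the other one did.
def tags_for(message)
  failed = [TASK_PATTERN, HOST_PATTERN].any? { |pattern| !pattern.match?(message) }
  failed ? ["_grokparsefailure"] : []
end

# Mimics one grok filter that is given both patterns: the event is tagged
# only when none of the patterns match.
def tags_for_single_filter(message)
  matched = [TASK_PATTERN, HOST_PATTERN].any? { |pattern| pattern.match?(message) }
  matched ? [] : ["_grokparsefailure"]
end
```

In Logstash terms, the second function corresponds to passing both patterns as an array to the `match` option of a single grok filter. Keeping two filters and setting `tag_on_failure => []` on them is another option, at the cost of hiding genuine parse failures.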

Below is my conf

## Parse Logs
input {
  file {
    path => "/var/log/dummy.log"
    start_position => "beginning"
    sincedb_path  => "/dev/null"
  }
}

filter {
  grok {
    patterns_dir => ["./patterns"]
    match => { "message" => "%{GROK_PATTERN}TASK%{SPACE}\[%{WORD:role}%{SPACE}\:%{SPACE}%{GREEDYDATA:task_name}\]" }
  }

## Store the value in @@taskname
  ruby {
       init => '@@taskname = ""'
       code => '
         if event.get("task_name")
            @@taskname = event.get("task_name")
         end'
  }

## Match events where the @@taskname should be added
  grok {
    patterns_dir => ["./patterns"]
    match => { "message" => "%{GROK_PATTERN}%{WORD:task_status}\:%{SPACE}\[%{HOSTNAME:hostname}.*" }
  }

## Insert the field
  if [task_status] =~ "changed" {
    ruby { code => 'event.set("task_name", @@taskname)' }
  }
}

output {
  elasticsearch {
    hosts => ["TESTVM-1:9200"]
    index => "dummy_logs"
  }
  stdout {}
} 

This is your log:
2022-03-08 14:52:50,672 p=1654827 u=ansible n=ansible | TASK [lvm : Formatting xfs filesystem]
2022-03-08 14:52:50,672 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-1]
2022-03-08 14:52:51,816 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-2]
2022-03-08 14:52:52,030 p=1654827 u=ansible n=ansible | TASK [lvm : Backup original fstab]
2022-03-08 14:52:52,031 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-1]
2022-03-08 14:52:52,049 p=1654827 u=ansible n=ansible | changed: [DUMMY-HOST-2]

Can you show what JSON output you would like for the 4 lines?

{ "time": "2022-03-08 14:52:50,672", "processid": "1654827", "appu":"ansible", "appn":"ansible", "action": "TASK", "activity": "lvm : Formatting xfs filesystem"
...
}

Appreciate your response. Below is how the JSON output should be. I have added the JSON output for all 6 lines.


1st Line

{
  "timer": [
    [
      "2022-03-08 14:52:50,672"
    ]
  ],
  "YEAR": [
    [
      "2022"
    ]
  ],
  "MONTHNUM": [
    [
      "03"
    ]
  ],
  "MONTHDAY": [
    [
      "08"
    ]
  ],
  "HOUR": [
    [
      "14",
      null
    ]
  ],
  "MINUTE": [
    [
      "52",
      null
    ]
  ],
  "SECOND": [
    [
      "50,672"
    ]
  ],
  "ISO8601_TIMEZONE": [
    [
      null
    ]
  ],
  "processid": [
    [
      "1654827"
    ]
  ],
  "appu": [
    [
      "ansible"
    ]
  ],
  "appn": [
    [
      "ansible"
    ]
  ],
  "role": [
    [
      "lvm "
    ]
  ],
  "task_name": [
    [
      " Formatting xfs filesystem"
    ]
  ]
}


2nd and 3rd Line (Please note that the task_name field doesn't exist in the 2nd and 3rd lines; it should be populated by referring to the 1st line, and the result would look like below.)

{
  "timer": [
    [
      "2022-03-08 14:52:50,672"
    ]
  ],
  "YEAR": [
    [
      "2022"
    ]
  ],
  "MONTHNUM": [
    [
      "03"
    ]
  ],
  "MONTHDAY": [
    [
      "08"
    ]
  ],
  "HOUR": [
    [
      "14",
      null
    ]
  ],
  "MINUTE": [
    [
      "52",
      null
    ]
  ],
  "SECOND": [
    [
      "50,672"
    ]
  ],
  "ISO8601_TIMEZONE": [
    [
      null
    ]
  ],
  "processid": [
    [
      "1654827"
    ]
  ],
  "appu": [
    [
      "ansible"
    ]
  ],
  "appn": [
    [
      "ansible"
    ]
  ],
  "action": [
    [
      "changed"
    ]
  ],
  "activity": [
    [
      "DUMMY-HOST-1"
    ]
  ],
  "task_name": [
    [
      " Formatting xfs filesystem"
    ]
  ]
}


{
  "timer": [
    [
      "2022-03-08 14:52:51,816"
    ]
  ],
  "YEAR": [
    [
      "2022"
    ]
  ],
  "MONTHNUM": [
    [
      "03"
    ]
  ],
  "MONTHDAY": [
    [
      "08"
    ]
  ],
  "HOUR": [
    [
      "14",
      null
    ]
  ],
  "MINUTE": [
    [
      "52",
      null
    ]
  ],
  "SECOND": [
    [
      "51,816"
    ]
  ],
  "ISO8601_TIMEZONE": [
    [
      null
    ]
  ],
  "processid": [
    [
      "1654827"
    ]
  ],
  "appu": [
    [
      "ansible"
    ]
  ],
  "appn": [
    [
      "ansible"
    ]
  ],
  "action": [
    [
      "changed"
    ]
  ],
  "activity": [
    [
      "DUMMY-HOST-2"
    ]
  ],
  "task_name": [
    [
      " Formatting xfs filesystem"
    ]
  ]
}


4th Line 

{
  "timer": [
    [
      "2022-03-08 14:52:52,030"
    ]
  ],
  "YEAR": [
    [
      "2022"
    ]
  ],
  "MONTHNUM": [
    [
      "03"
    ]
  ],
  "MONTHDAY": [
    [
      "08"
    ]
  ],
  "HOUR": [
    [
      "14",
      null
    ]
  ],
  "MINUTE": [
    [
      "52",
      null
    ]
  ],
  "SECOND": [
    [
      "52,030"
    ]
  ],
  "ISO8601_TIMEZONE": [
    [
      null
    ]
  ],
  "processid": [
    [
      "1654827"
    ]
  ],
  "appu": [
    [
      "ansible"
    ]
  ],
  "appn": [
    [
      "ansible"
    ]
  ],
  "role": [
    [
      "lvm"
    ]
  ],
  "task_name": [
    [
      "Backup original fstab"
    ]
  ]
}



5th and 6th Line (Please note that the task_name field doesn't exist in the 5th and 6th lines; it should be populated by referring to the 4th line, and the result would look like below.)

{
  "timer": [
    [
      "2022-03-08 14:52:52,031"
    ]
  ],
  "YEAR": [
    [
      "2022"
    ]
  ],
  "MONTHNUM": [
    [
      "03"
    ]
  ],
  "MONTHDAY": [
    [
      "08"
    ]
  ],
  "HOUR": [
    [
      "14",
      null
    ]
  ],
  "MINUTE": [
    [
      "52",
      null
    ]
  ],
  "SECOND": [
    [
      "52,031"
    ]
  ],
  "ISO8601_TIMEZONE": [
    [
      null
    ]
  ],
  "processid": [
    [
      "1654827"
    ]
  ],
  "appu": [
    [
      "ansible"
    ]
  ],
  "appn": [
    [
      "ansible"
    ]
  ],
  "action": [
    [
      "changed"
    ]
  ],
  "activity": [
    [
      "DUMMY-HOST-1"
    ]
  ],
  "task_name": [
    [
      "Backup original fstab"
    ]
  ]
}


{
  "timer": [
    [
      "2022-03-08 14:52:52,049"
    ]
  ],
  "YEAR": [
    [
      "2022"
    ]
  ],
  "MONTHNUM": [
    [
      "03"
    ]
  ],
  "MONTHDAY": [
    [
      "08"
    ]
  ],
  "HOUR": [
    [
      "14",
      null
    ]
  ],
  "MINUTE": [
    [
      "52",
      null
    ]
  ],
  "SECOND": [
    [
      "52,049"
    ]
  ],
  "ISO8601_TIMEZONE": [
    [
      null
    ]
  ],
  "processid": [
    [
      "1654827"
    ]
  ],
  "appu": [
    [
      "ansible"
    ]
  ],
  "appn": [
    [
      "ansible"
    ]
  ],
  "action": [
    [
      "changed"
    ]
  ],
  "activity": [
    [
      "DUMMY-HOST-2"
    ]
  ],
  "task_name": [
    [
      "Backup original fstab"
    ]
  ]
}
