My main goal is to parse Apache Airflow logs into particular fields using Logstash, feed them into Elasticsearch, and visualise them with Kibana. There is no ready-made grok pattern for Airflow logs, and I'm fairly new to the ELK stack, so any help parsing the important info out of the Airflow logs would be appreciated.
I've been trying plain regex because I wasn't able to find a suitable grok filter to achieve the required result:
```
\*\s\w*\s\w*\s\w*\:\s\/[\w]*\/[\w]*\/[\w]*\/(?<dag_name>[\w]*)\/(?<task_id>[\w]*)\/(?<trigger_time>[\w\-\:\+]*)\/(?<file>[\w\.]*)| \[(?<start_time>[\d\-\s\:\,]*)\]\s\{(?<runner>[\w\.]*):(?<line_no>[\d]*)\}\s(?<level>[\w]*)\s\-\s(?<message>[\w\s\<\:\.\-\+]*)
```
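To sanity-check the idea outside Logstash, I split it into two smaller regexes and tried them in Python (note Python spells named groups `(?P<name>…)` rather than `(?<name>…)`). The sample lines below are assumptions based on a typical Airflow 1.x task log, so the exact layout may differ from yours:

```python
import re

# First line of a task log:
# "*** Reading local file: <base>/<dag>/<task>/<execution_date>/<try>.log"
HEADER = re.compile(
    r"\*{3} Reading local file: "
    r".*/(?P<dag_name>[\w-]+)/(?P<task_id>[\w-]+)/"
    r"(?P<trigger_time>[\w\-:+.]+)/(?P<file>[\w.]+)"
)

# Regular event lines:
# "[<timestamp>] {<module>:<line_no>} LEVEL - message"
EVENT = re.compile(
    r"\[(?P<start_time>[\d\- :,]+)\] "
    r"\{(?P<runner>[\w.]+):(?P<line_no>\d+)\} "
    r"(?P<level>\w+) - (?P<message>.*)"
)

header = HEADER.match(
    "*** Reading local file: /root/airflow/logs/my_dag/my_task/"
    "2020-04-02T00:00:00+00:00/1.log"
)
print(header.group("dag_name"), header.group("task_id"))  # my_dag my_task

event = EVENT.match(
    "[2020-04-03 11:33:24,546] {taskinstance.py:900} INFO - "
    "Task exited with return code 0"
)
print(event.group("level"), event.group("message"))
# INFO Task exited with return code 0
```

Both regexes match on these sample lines, so I think the pattern itself is roughly right; my problem is turning this into a proper grok filter.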
I mainly want to parse the first, second, and last lines of the Airflow log. The fields I want are:
- dag_name
- task_name
- trigger_time
- No_of_runs
- start_time
- end_time
- message (e.g. "Task exited with return code 0")
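Here is roughly what I think the grok equivalent would look like, built only from stock patterns (`TIMESTAMP_ISO8601`, `LOGLEVEL`, `DATA`, `NUMBER`, `GREEDYDATA`). I'm not sure it's correct: the path layout in the second pattern is my assumption from the default `logs/<dag>/<task>/<execution_date>/<try>.log` structure, so it may need tweaking:

```
filter {
  grok {
    match => {
      "message" => [
        # event lines: "[2020-04-03 11:33:24,546] {taskinstance.py:900} INFO - ..."
        "\[%{TIMESTAMP_ISO8601:start_time}\] \{%{DATA:runner}:%{NUMBER:line_no}\} %{LOGLEVEL:level} - %{GREEDYDATA:log_message}",
        # header line: "*** Reading local file: /root/airflow/logs/<dag>/<task>/<execution_date>/<try>.log"
        "Reading local file: %{GREEDYDATA:log_dir}/%{DATA:dag_name}/%{DATA:task_id}/%{DATA:trigger_time}/%{GREEDYDATA:file}"
      ]
    }
  }
}
```

I captured the log text as `log_message` to avoid clobbering the `message` field Logstash already sets on the event.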