Why can't this match successfully?


I am trying to mark missing fields in the logs, but some log entries with missing fields cannot match successfully, which prevents them from being marked. What should I do? Is there something wrong with my configuration file?

The original log format is:
2024-10-15T00:00:03.172528+08:00 yp-VMware-Virtual-Platform systemd[1]: rsyslog.service: Sent signal SIGHUP to main process 1346 (rsyslogd) on client request.
How can I successfully match and check for missing fields in each log entry?
Please, everyone, help me!

Please show us the log lines you are trying to parse and the pattern you are using to parse them so that we can reproduce the problem. Do not post pictures of text, just post text. Pictures are not searchable, not pasteable, and some folks may not even be able to see them.

(?:%{TIMESTAMP_ISO8601:timestamp})? (?:%{DATA:host})? (?:%{DATA:process_name})?(?:\[%{NUMBER:pid}\])?: (?:%{GREEDYDATA:log_message})?

I am trying to parse log lines like the following, where some fields may be missing:

yp-VMware-Virtual-Platform systemd[1]: rsyslog.service: Sent signal SIGHUP to main process 1346 (rsyslogd) on client request.
2024-10-15T00:00:03.172528+08:00 yp-VMware-Virtual-Platform systemd[1]:
2024-10-15T00:00:03.172528+08:00 systemd[1]: rsyslog.service: Sent signal SIGHUP to main process 1346 (rsyslogd) on client request.

I have made each field optional so that the match can succeed even when fields are missing. Perhaps the error is related to the spaces?

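Yes. If the space and log message after the process name and pid are optional, then both need to be optional in the pattern. As written, the pattern requires a literal space after the colon, so a line that ends at the colon cannot match. You could change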

(?:%{DATA:process_name})?(?:\[%{NUMBER:pid}\])?: (?:%{GREEDYDATA:log_message})?

to

(?:%{DATA:process_name})?(?:\[%{NUMBER:pid}\])?:(?: %{GREEDYDATA:log_message})?
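
To see why moving the space inside the optional group matters, here is a minimal Python sketch of just this tail of the pattern. It is not grok itself: `.*?`, `\d+` and `.*` are simplified stand-ins I am assuming for DATA, NUMBER and GREEDYDATA, Python's `re` stands in for grok's regex engine, and `tail_matches` is a hypothetical helper.

```python
import re

# Simplified stand-ins (assumptions, not the real grok definitions):
#   DATA -> .*?    NUMBER -> \d+    GREEDYDATA -> .*
# Original tail: the space after the colon is mandatory.
ORIGINAL_TAIL = r"(?:(?P<process_name>.*?))?(?:\[(?P<pid>\d+)\])?: (?:(?P<log_message>.*))?"
# Changed tail: the space lives inside the optional group with the message.
FIXED_TAIL = r"(?:(?P<process_name>.*?))?(?:\[(?P<pid>\d+)\])?:(?: (?P<log_message>.*))?"

WITH_MESSAGE = "systemd[1]: rsyslog.service: Sent signal SIGHUP"
WITHOUT_MESSAGE = "systemd[1]:"


def tail_matches(pattern: str, text: str) -> bool:
    """Return True if the pattern matches anywhere in the text (unanchored search)."""
    return re.search(pattern, text) is not None


if __name__ == "__main__":
    for name, pattern in (("original", ORIGINAL_TAIL), ("fixed", FIXED_TAIL)):
        for text in (WITH_MESSAGE, WITHOUT_MESSAGE):
            print(name, repr(text), tail_matches(pattern, text))
```

In this sketch the original tail fails only on the line that ends at the colon, because there is no colon followed by a space anywhere in it; the changed tail accepts both lines.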

Then

input { generator { count => 1 lines => [
    '2024-10-15T00:00:03.172528+08:00 yp-VMware-Virtual-Platform systemd[1]: rsyslog.service: Sent signal SIGHUP to main process 1346 (rsyslogd) on client request.',
    '2024-10-15T00:00:03.172528+08:00 yp-VMware-Virtual-Platform systemd[1]:',
    '2024-10-15T00:00:03.172528+08:00 systemd[1]: rsyslog.service: Sent signal SIGHUP to main process 1346 (rsyslogd) on client request.'
] } }
output { stdout { codec => rubydebug { metadata => false } } }
filter {
    grok { match => { "message" => "(?:%{TIMESTAMP_ISO8601:timestamp})? (?:%{DATA:host})? (?:%{DATA:process_name})?(?:\[%{NUMBER:pid}\])?:(?: %{GREEDYDATA:log_message})?" } }
}

will parse all three messages.
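
For checking which fields are actually missing once a line matches, the following Python sketch applies a rough equivalent of the changed pattern to the same three lines. The regexes are simplified stand-ins I am assuming for TIMESTAMP_ISO8601, DATA, NUMBER and GREEDYDATA, not the real grok definitions, and `parse` and the `missing` key are hypothetical names used only for illustration.

```python
import re

# Simplified stand-in for TIMESTAMP_ISO8601 (assumption, narrower than grok's).
TIMESTAMP = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?"

# Rough equivalent of the changed grok pattern:
#   DATA -> .*?    NUMBER -> \d+    GREEDYDATA -> .*
PATTERN = re.compile(
    r"(?:(?P<timestamp>" + TIMESTAMP + r"))? "
    r"(?:(?P<host>.*?))? "
    r"(?:(?P<process_name>.*?))?(?:\[(?P<pid>\d+)\])?:"
    r"(?: (?P<log_message>.*))?"
)

LINES = [
    "2024-10-15T00:00:03.172528+08:00 yp-VMware-Virtual-Platform systemd[1]: "
    "rsyslog.service: Sent signal SIGHUP to main process 1346 (rsyslogd) on client request.",
    "2024-10-15T00:00:03.172528+08:00 yp-VMware-Virtual-Platform systemd[1]:",
    "2024-10-15T00:00:03.172528+08:00 systemd[1]: "
    "rsyslog.service: Sent signal SIGHUP to main process 1346 (rsyslogd) on client request.",
]


def parse(line):
    """Return the captured fields plus a list of empty or absent ones, or None on no match."""
    match = PATTERN.search(line)  # unanchored, like a grok match
    if match is None:
        return None
    fields = match.groupdict()
    # A field counts as missing if its group did not participate or captured nothing.
    fields["missing"] = [name for name, value in fields.items() if not value]
    return fields


if __name__ == "__main__":
    for line in LINES:
        print(parse(line))
```

In this sketch all three lines match, and the second one reports `log_message` as missing. A successful match does not guarantee that values land in the intended fields, though: for the third line, which has no host, the sketch captures `systemd[1]:` as `host` and `rsyslog.service` as `process_name`, and reports only `pid` as missing. It may be worth checking the rubydebug output for that line to see whether grok behaves the same way.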

Thank you so much!