I'm not following, but let's take syslog messages as an example. Logstash ships with some grok patterns for syslog messages, like SYSLOGBASE which is defined like this:
$ grep SYSLOGBASE /opt/logstash/patterns/grok-patterns
SYSLOGBASE %{SYSLOGTIMESTAMP:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:
When used like
filter {
  grok {
    match => ["message", "%{SYSLOGBASE}"]
  }
}
Logstash will attempt the match and extract the fields timestamp, program, pid, logsource, facility, priority, and a few others from the message field. We could then end up with an event looking like this:
{
  "logsource": "somehostname",
  "timestamp": "Jun 1 07:50:01",
  "program": "CRON",
  "pid": "22912",
  "message": "Jun 1 07:50:01 somehostname CRON[22912]: (root) CMD ( /path/to/some/program > /dev/null 2>&1)"
}
By changing the filter to
filter {
  grok {
    match => ["message", "%{SYSLOGBASE} %{GREEDYDATA:message}"]
    overwrite => ["message"]
  }
}
we capture the actual message part of the original message and save it back into the message field, yielding:
{
  "logsource": "somehostname",
  "timestamp": "Jun 1 07:50:01",
  "program": "CRON",
  "pid": "22912",
  "message": "(root) CMD ( /path/to/some/program > /dev/null 2>&1)"
}
Now things are starting to look useful.
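If it helps to see the idea outside of Logstash, here is a rough Python sketch of what the two filters do: named capture groups become fields, and a capture named message replaces the original field only when it is listed in overwrite. The regex is a much simplified stand-in for SYSLOGBASE (it skips the optional facility part), and the grok() helper is hypothetical, not Logstash code. In particular, leaving an existing field untouched when it is not in overwrite is a simplification of mine and not a statement about how Logstash handles that case.

```python
import re

# Simplified stand-in for %{SYSLOGBASE}; the patterns shipped with
# Logstash are more permissive than this (assumption for illustration).
SYSLOGBASE = (
    r"(?P<timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<logsource>\S+) "
    r"(?P<program>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:"
)


def grok(event, pattern, overwrite=()):
    """Hypothetical helper: match `pattern` against event["message"] and
    return a copy of the event with the named captures added as fields."""
    result = dict(event)
    match = re.search(pattern, event["message"])
    if match is None:
        return result  # no match: the event is returned unchanged
    for name, value in match.groupdict().items():
        if value is None:
            continue  # optional group that did not take part in the match
        if name in result and name not in overwrite:
            continue  # simplification: keep the existing field as it is
        result[name] = value
    return result


event = {
    "message": "Jun 1 07:50:01 somehostname CRON[22912]: "
               "(root) CMD ( /path/to/some/program > /dev/null 2>&1)"
}

# Like the first filter: extract fields, leave message alone.
first = grok(event, SYSLOGBASE)

# Like the second filter: also capture the rest of the line into message.
second = grok(event, SYSLOGBASE + r" (?P<message>.*)", overwrite=("message",))

print(first)
print(second)
```

The first call leaves message as the full syslog line, while the second one trims it down to the part after the "CRON[22912]:" prefix, which mirrors the two results shown above.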