What does the field in Grok "message" do?

michaellizhou · June 9, 2015, 8:24pm

Hi, this is my first post and I am new to the ELK stack. I am working on parsing log4j logs and I cannot figure out the purpose of the "message" field. For example:

filter {
  grok {
    match => { "message" =>

I am confused about whether it is just a naming convention or a standard. This is probably a really basic question, but I need to find out before moving on. Is message just something I can refer back to if I need to?

Thanks.
Mike

magnusbaeck · June 9, 2015, 8:29pm

The message field is like a default field. It's where most input plugins place the payload that they receive from the network, read from a file, or whatever. So no, it's not just a convention.
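
A quick way to see this is a minimal pipeline with no filters at all. This is only a sketch, assuming the stock stdin input and the stdout output with the rubydebug codec; each line typed on standard input should come out as an event whose message field holds that line unchanged.

```
# Minimal pipeline: no filters, so the event is printed as the input created it.
input {
  stdin { }
}

output {
  # rubydebug pretty-prints every field of the event, including "message".
  stdout { codec => rubydebug }
}
```

Alongside message, the printed event typically carries a few fields added by Logstash itself, such as @timestamp and @version.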

In many log formats the message field starts with a timestamp, maybe a severity level, possibly a hostname, and so on, and ends with the actual message. In such cases one typically extracts the timestamp etc. into fields of their own and removes them from the message field. In other cases, like HTTP logs, there is no free-text message.

michaellizhou · June 9, 2015, 8:37pm

Oh, alright, so the message field will hold the bulk of the default fields. Does this mean that after I use the message field to store, say, the timestamp, level, groupId, etc., I would then filter out more of the log message underneath? For example:

filter {
  grok {
    match => { "message" => "%{SYSLOGTIMESTAMP:installTime}" }
    match => { "someotherfield" => "more grok or regex" }
  }
}

Thanks for help!

magnusbaeck · June 9, 2015, 8:56pm

I'm not following, but let's take syslog messages as an example. Logstash ships with some grok patterns for syslog messages, like SYSLOGBASE which is defined like this:

$ grep SYSLOGBASE /opt/logstash/patterns/grok-patterns
SYSLOGBASE %{SYSLOGTIMESTAMP:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:

When used like

filter {
  grok {
    match => ["message", "%{SYSLOGBASE}"]
  }
}

Logstash will attempt the match and extract the fields timestamp, program, pid, logsource, facility, priority and a few others from the message field. We could then end up with an event looking like this:

{
  "logsource": "somehostname",
  "timestamp": "Jun  1 07:50:01",
  "program": "CRON",
  "pid": "22912",
  "message": "Jun  1 07:50:01 somehostname CRON[22912]: (root) CMD ( /path/to/some/program > /dev/null 2>&1)"
}

By changing the filter to

filter {
  grok {
    match => ["message", "%{SYSLOGBASE} %{GREEDYDATA:message}"]
    overwrite => ["message"]
  }
}

we capture the actual message part of the original message and save it back into the message field, yielding:

{
  "logsource": "somehostname",
  "timestamp": "Jun  1 07:50:01",
  "program": "CRON",
  "pid": "22912",
  "message": "(root) CMD ( /path/to/some/program > /dev/null 2>&1)"
}

Now things are starting to look useful.

michaellizhou · June 9, 2015, 10:17pm

I think I am starting to understand more now. So the SYSLOGBASE pattern from your example picks up most of the overhead that comes with the log line. The message capture takes everything afterwards, and in your example you want message to be "(root) CMD...". Does this mean I can call message anything else, e.g. data or msg?

magnusbaeck · June 10, 2015, 3:48am

Sure. There might be specific exceptions for some output plugins, but what you call the output fields is generally up to you.
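
For instance, a variant of the earlier filter that stores the remainder in a field called msg might look like the sketch below. The name msg is only an illustration. Since the capture no longer targets message, the overwrite option is dropped, and the original message field is left untouched unless it is removed explicitly (here with the filter's remove_field option, which is only applied when the match succeeds).

```
filter {
  grok {
    # Capture everything after the syslog header into "msg" instead of "message".
    match => ["message", "%{SYSLOGBASE} %{GREEDYDATA:msg}"]
    # Optional: drop the raw line once it has been parsed successfully.
    remove_field => ["message"]
  }
}
```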
