New field in filebeat.template.default.json

Hi guys, I am new to the ELK architecture and trying to understand how I can add extra fields to the default filebeat.template.default.json search pattern that I loaded on the ELK server by executing: curl -XPUT 'http://localhost:9200/_template/filebeat?pretty' -d@filebeat.template.default.json. Now, when I go to Kibana I can see 150 fields in total for the filebeat index, but only ~15 of them are available for searching. My question is how I can add an extra field so that I can sort by container name (the CAPITALIZED part below) in this log string:

2017-10-17T06:36:25.363811+00:00 compass-staging DOCKER/COMPASS_PROCESSORS_1[1354]: [2017-10-17 06:36:25,363: DEBUG/ForkPoolWorker-1] compass.processing.scheduled_tasks.launch_processors[231c2537-3736-41e6-9552-a215b0b6507c]: Start launch_processors with label.

I don't really understand how this field definition in the filebeat.template.default.json template understands and parses only the message part of the log entry:

"message": {
"norms": false,
"type": "text"
},

or date for example:

"@timestamp": {
"type": "date"
},

Thank you in advance.

This is an index template. It defines what data types are used for each field (e.g. offset is a long or @timestamp is a date).
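For reference, the snippets you pasted live under the template's mappings section; trimmed down, the relevant part of filebeat.template.default.json looks roughly like this (a sketch, not the full file):

"mappings": {
  "_default_": {
    "properties": {
      "offset": {
        "type": "long"
      },
      "@timestamp": {
        "type": "date"
      }
    }
  }
}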

That's because it doesn't define how the message is parsed. If you want to parse the message to extract the container name you'll want to use grok from either Logstash or Ingest Node.
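If you go the Ingest Node route (i.e. Filebeat shipping directly to Elasticsearch), a pipeline would look roughly like this. This is an untested sketch; the pipeline name docker-syslog and the extracted field names are just placeholders:

PUT _ingest/pipeline/docker-syslog
{
  "description": "sketch: pull the container name out of syslog-style Docker lines",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{TIMESTAMP_ISO8601:timestamp} %{SYSLOGHOST:hostname} %{DATA:container_name}\\[%{POSINT:pid}\\]: %{GREEDYDATA:log_message}"
        ]
      }
    }
  ]
}

Filebeat then needs pipeline: docker-syslog set on its Elasticsearch output so that events are run through the pipeline.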

Thank you a lot, Andrew. If you don't mind, I have a couple more questions; this is still a mystery to me. In Kibana I am able to see 150 fields but only 15 are available to search on. Why are other fields, such as the nginx or apache ones, listed as "missing fields" and all grayed out? How do I activate them? Do I need to create a new index for them to be activated? Which file do I need to modify to extract the container name from the message? I was thinking I could simply add an extra field to the template.json file I uploaded and it would work? Or do I need to modify both the syslog filter and template.json? Here is my syslog filter; I use Logstash:

root@centralizedlogging:/etc/logstash/conf.d# cat 10-syslog-filter.conf 
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

Thank you very much in advance, Andrew.

This list is all of the fields that Filebeat can possibly send through the use of Filebeat modules and other configuration settings. If you aren't using these modules or other processors in your config, then you will likely have a much smaller subset of these fields present in Elasticsearch, hence most are grayed out.

No, all data sent by Filebeat should go to a single daily index. Look through the Filebeat modules link above. But do note that Filebeat modules only work when Filebeat sends directly to Elasticsearch (this is because they use Ingest Node for parsing). If you want to do similar parsing with Logstash, see these examples.

In general you don't need to modify the index template provided by Filebeat. It already has sane defaults, and it does not control how messages are parsed. You need to add a grok filter that extracts the information from the message field. For example:

# Generate data for testing:
input {
  generator {
    lines => [
      "2017-10-17T06:36:25.363811+00:00 compass-staging DOCKER/COMPASS_PROCESSORS_1[1354]: [2017-10-17 06:36:25,363: DEBUG/ForkPoolWorker-1] compass.processing.scheduled_tasks.launch_processors[231c2537-3736-41e6-9552-a215b0b6507c]: Start launch_processors with label."
    ]
    # Emit each line once.
    count => 1
  }
}

input {
  beats { port => 5045 }
}

filter {
  grok {
    match => {
      "message" => "%{SYSLOGLINE}"
    }
    overwrite => [ "message" ]
  }

  date {
    match => [ "timestamp8601", "ISO8601" ]
    remove_field => [ "timestamp8601" ]
  }

  grok {
    match => {
      "message" => "\[%{DATA:log_timestamp} %{WORD:log_level}/%{DATA:log_thread}\] %{DATA:log_class}\[%{DATA:container_id}\]: %{GREEDYDATA:message}"
    }
    overwrite => [ "message" ]
  }
}

output {
  # For debugging:
  stdout { codec => rubydebug { metadata => true } }
}

This produces the following event:

{
    "log_timestamp" => "2017-10-17 06:36:25,363:",
        "log_level" => "DEBUG",
              "pid" => "1354",
          "program" => "DOCKER/COMPASS_PROCESSORS_1",
          "message" => "Start launch_processors with label.",
        "logsource" => "compass-staging",
         "sequence" => 0,
       "@timestamp" => 2017-10-17T06:36:25.363Z,
       "log_thread" => "ForkPoolWorker-1",
         "@version" => "1",
             "host" => "d35ae0a3e8cf",
        "log_class" => "compass.processing.scheduled_tasks.launch_processors",
     "container_id" => "231c2537-3736-41e6-9552-a215b0b6507c"
}
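The capitalized part you asked about ends up in the program field ("DOCKER/COMPASS_PROCESSORS_1"). If you want the container name by itself, you could add one more small grok on top of the example above (untested sketch; docker_tag and container_name are field names I made up):

filter {
  grok {
    # Split "DOCKER/COMPASS_PROCESSORS_1" into the parts before and after the slash.
    match => { "program" => "%{DATA:docker_tag}/%{GREEDYDATA:container_name}" }
  }
}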

Andrew, I have my Logstash configured so that the {input,filter,output} files live under /etc/logstash/conf.d/. Today I added a new filter to try it out:

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "^%{SYSLOGTIMESTAMP:date} [pid: %{INT:pid}]: %{LOGLEVEL:loglevel}:%{GREEDYDATA:filepath}:Call( by user %{QUOTEDSTRING:username})? (raised %{WORD:abort}|completed without aborting)" }
      add_field => { "type" => "user_info" }
    }
    grok {
      match => { "message" => "[.] %{IP:ipaddress} () {.} [%{DATESTAMP_FLASK:timestamp}] %{WORD:method} %{URIPATHPARAM:fullpath} => generated %{INT:bytes} bytes in %{INT:ms} msecs (HTTP/1.1 %{INT:return_code}" }
      add_field => { "type" => "flask_info" }
    }
  }
}

But when I looked at Kibana I saw ONLY exactly the same available fields as I had before, none of the extra fields that I was hoping to get with the NEW grok pattern. How can I activate the new fields? My default filter looked like this:
cat /etc/logstash/conf.d/10-syslog-filter.conf:

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

These are the fields I am able to see, and they did not change at all with the new filter. Am I missing something?

Popular:

@timestamp
@version
_id
_index
_score
host
message
offset
source
type
_type
beat.hostname
beat.name
beat.version
input_type
tags

Thank you in advance

And where do I specify the fields that Filebeat is sending to Kibana? Why is it sending only these fields? These are the (Popular) fields that I am seeing in Kibana:
2017/11/20 23:09:16.037434 client.go:214: DBG Publish: {
  "@timestamp": "2017-11-20T23:09:11.037Z",
  "beat": {
    "hostname": "analytics-0",
    "name": "analytics-0",
    "version": "5.6.2"
  },
  "input_type": "log",
  "message": "Nov 20 18:09:11 analytics-0 sudo: pam_unix(sudo:session): session opened for user root by ysibirski(uid=0)",
  "offset": 269195,
  "source": "/var/log/auth.log",
  "type": "log"
}

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.