Adding new fields from a match [solved]

Hi. New to logstash. One thing I don't understand: when you match a line with something like %{GREEDYDATA:myfield1}, does that automatically create a corresponding field in elasticsearch, or do I need to add it on the elasticsearch side or use an add_field directive under the grok config?

That will add a field to the event, so you do not need add_field. Provided that your index has dynamic field mapping enabled (which is on by default), that will also cause it to be added to the document in elasticsearch.
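If you want to double-check, you can ask elasticsearch for the index mapping directly. A minimal sketch, assuming your output writes to the default logstash-* daily indices (adjust the index pattern to whatever you actually use):

# Check the mapping of the logstash-* indices; unless the response shows
# "dynamic": "false" (or "strict") for the index, new fields created by grok
# will be mapped automatically.
curl -XGET 'http://localhost:9200/logstash-*/_mapping?pretty'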

Thanks for answering. Looking through the documentation you linked to, I couldn't find how this dynamic mapping value gets set, but I also couldn't find a setting for it in any of my config under /etc, so I'm assuming it's set to the default.

That being the case, I'm not sure why my config isn't working. I'm trying to match crontab command logs and separate out the cron run-as user, the command, and so on into their own fields. Here is my config:

input {
 beats {
   port => 5044
   ssl => true
   ssl_certificate => "/etc/logstash/logstash.crt"
   ssl_key => "/etc/logstash/logstash.key"
   congestion_threshold => "40"
 }
}


filter {
 if [type] == "syslog" {
   grok {
      match => { "message" => [
                "(?:%{SYSLOGTIMESTAMP:timestamp}|%{TIMESTAMP_ISO8601:timestamp}) %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: \(%{GREEDYDATA:cron_user}\) CMD \(%{GREEDYDATA:cron_cmd}\)"
                ] }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
   }
   syslog_pri { }
   date {
     match => [ "timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
   }
 }
}

output {
 elasticsearch {
    hosts => "localhost"
    #index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    index => "logstash-%{+YYYY.MM.dd}"
 }
 stdout {
    codec => rubydebug
 }
}

I'm seeing new syslog entries in Kibana, but they don't have the cron_user, cron_cmd, or syslog_program fields, or even the received_at or received_from fields. I tested my match expression using one of the online grok testing sites and it matches the lines from my syslog file, so I'm not sure where I'm going wrong. I noticed that the if [type] == "syslog" line may not be correct: in Kibana there is no field called exactly "type". There is an input.type, which is set to "log" in the entries I am seeing, and an _type field, which is set to "logs". Could that be the problem?

Definitely.

Ok. You say that, but I probably copied the config from this example while learning about an Elastic Stack setup, so I can't be faulted too much. :wink: It looks like I took the filter section without adding the corresponding type line from the input section. Is that the problem?

I tried using if [source] == "/var/log/syslog" and that worked, but that seems less than ideal. Is there a best field to select on here? event.dataset? fileset.name?

Right. In that example the type field is being set by the tcp/udp inputs. You cannot do that with a beats input, because filebeat will already have set type. In this situation I would use the fields (and possibly fields_under_root) options to add a field to the events that tells logstash what type of processing it should do.
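One thing to watch out for: if you set fields without fields_under_root, filebeat nests the custom field under a top-level fields object, so the conditional in your filter has to reference it as a nested field. A rough sketch of the logstash side in that case:

filter {
  # Without fields_under_root, the custom field arrives nested under "fields",
  # so it is referenced as [fields][type] instead of [type].
  if [fields][type] == "syslog" {
    # grok / syslog_pri / date filters as in the config above
  }
}

With fields_under_root: true the field is promoted to the top level and your existing if [type] == "syslog" conditional works as-is.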

Thanks, that fixed my problem of course. For the benefit of others, here is my filebeat config now:

- type: log
  enabled: true
  paths:
    - /var/log/syslog
    - /var/log/auth.log
  fields:
    type: syslog
  fields_under_root: true
