Missing fields Filebeat -> Logstash -> Elasticsearch


(Matt Shields) #1

I'm using Filebeat on Windows to send some custom logs to a custom index on an Elasticsearch server, but I'm sending them through Logstash so I can apply a custom filter. Previously I tried sending them directly from Filebeat to ES and using an Ingest Pipeline. This didn't work.

So the problem I'm having is that the logs are going into my index, but I'm losing the following fields. When I use Filebeat to import Apache logs on the Linux server directly to ES, these fields are present in the filebeat-* index. For the Windows servers using the fan-* index, they are not.

  • fields.application
  • fields.environment
  • fields.ip_address
  • fields.production
  • meta.cloud.availability_zone
  • meta.cloud.instance_id
  • meta.cloud.machine_type
  • meta.cloud.provider
  • meta.cloud.region

Below are the configs.

Windows server Filebeat config:

filebeat.prospectors:
- input_type: log
  paths:
    - C:\fan\logs\*.log
  exclude_files: ['^.*\-all\-.*\.log$']
  multiline.pattern: '^[[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after

processors:
- add_cloud_metadata:

fields:
  application: apigateway
  environment: stage
  production: No
  ip_address: 10.10.88.195

fields_under_root: true

output.logstash:
  hosts: ["logstash.ourdomain.com:5326"]

Logstash server:

input {
  beats {
    port => "5326"
    host => "0.0.0.0"
    codec => multiline {
      #pattern => "^[[0-9]{4}-[0-9]{2}-[0-9]{2}"
      pattern => "^\[%{TIMESTAMP_ISO8601}\]"
      negate => true
      what => "previous"
    }
  }
}
# The filter part of this file is commented out to indicate that it is
# optional.
filter {
  grok {
    match => {
      "message" =>
        "\[%{TIMESTAMP_ISO8601:fan.access.time}?\]\|%{DATA:fan.access.assembly_version}?\|%{DATA:fan.access.logger}?\|%{DATA:fan.access.log_level}?\|%{IPORHOST:fan.access.remote_ip}?\|%{IPORHOST:fan.access.remote_ip_internal}?\|%{DATA:fan.access.referrer}?\|%{DATA:fan.access.agent}?\|%{DATA:fan.access.method}?\|%{DATA:fan.access.url}?\|%{DATA:fan.access.user_id}?\|%{DATA:fan.access.tenant_id}?\|%{DATA:fan.access.correlation_id}?\|%{DATA:fan.access.log_message}?\|%{GREEDYDATA:fan.access.log_exception}?"
    }
  }
    mutate {
      add_field => { "read_timestamp" => "%{@timestamp}" }
    }
    date {
      match => [ "[fan][access][time]", "[YYYY-mm-dd H:m:s.SSSS]" ]
      remove_field => "[fan][access][time]"
      target => "@timestamp"
    }
    useragent {
      source => "[apache2][access][agent]"
      target => "[apache2][access][user_agent]"
      remove_field => "[apache2][access][agent]"
    }

    geoip {
      source => "[apache2][access][remote_ip]"
      target => "[apache2][access][geoip]"
    }
}
output {
    elasticsearch {
      hosts => ["ops-log001.ourdomain.com:9200","ops-log002.ourdomain.com:9200","ops-log003.ourdomain.com:9200"]
      index => ["fan-%{+yyyy-MM-dd}"]
      manage_template => false
      document_type => 'log'
    }
    stdout { codec => rubydebug }
}

Here's what I see imported into the fan-* index:

{
  "_index": "fan-2017-09-25",
  "_type": "log",
  "_id": "AV66c5ILKqZrV5vGc2bx",
  "_version": 1,
  "_score": null,
  "_source": {
    "fan.access.logger": "IdentityServer4.AccessTokenValidation.Infrastructure.NopAuthenticationMiddleware",
    "fan.access.method": "GET",
    "fan.access.log_message": "Bearer was not authenticated. Failure message: No token found.",
    "fan.access.remote_ip_internal": "10.10.10.10",
    "read_timestamp": "2017-09-25T19:11:29.792Z",
    "message": "[2017-09-25 19:11:11.3854]|1.1|IdentityServer4.AccessTokenValidation.Infrastructure.NopAuthenticationMiddleware|INFO|10.10.90.126|10.10.10.10||Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.29 Safari/537.36|GET|'http://apigateway.ourdomain.com/v1/home'||||Bearer was not authenticated. Failure message: No token found.|",
    "fan.access.time": "2017-09-25 19:11:11.3854",
    "tags": [
      "_geoip_lookup_failure"
    ],
    "fan.access.agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.29 Safari/537.36",
    "@timestamp": "2017-09-25T19:11:29.792Z",
    "fan.access.assembly_version": "1.1",
    "fan.access.remote_ip": "10.10.90.126",
    "@version": "1",
    "fan.access.log_level": "INFO",
    "fan.access.url": "'http://apigateway.ourdomain.com/v1/home'"
  },
  "fields": {
    "@timestamp": [
      1506366689792
    ]
  },
  "highlight": {
    "message": [
      "[2017-09-25 19:11:11.3854]|1.1|IdentityServer4.AccessTokenValidation.Infrastructure.NopAuthenticationMiddleware|INFO|10.10.90.126|10.10.10.10||Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.29 Safari/537.36|GET|'http://apigateway.ourdomain.com/v1/@kibana-highlighted-field@home@/kibana-highlighted-field@'||||Bearer was not authenticated. Failure message: No token found.|"
    ]
  },
  "sort": [
    1506366689792
  ]
}

(Magnus Bäck) #2

Don't use a multiline codec in the beats input. Always apply multiline configurations as close to the source as possible, in this case in Filebeat.
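
A minimal sketch of what that could look like on the Logstash side, assuming the rest of the pipeline above stays unchanged: the beats input drops its codec block, and the multiline settings already present in the Filebeat config join the lines before events are shipped.

```
input {
  beats {
    # Same port and bind address as in the original config
    port => "5326"
    host => "0.0.0.0"
    # No multiline codec here; lines are joined by Filebeat before shipping
  }
}
```

With the codec removed, events should arrive in the filter stage with the fields that Filebeat attached to them.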


(Matt Shields) #3

Thanks, that seems to have worked. Most values are now going into the proper fields. The ones that aren't seem to be caused by multiline issues, so I'll troubleshoot that next.
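
One possible cause of the remaining multiline problems, offered as a guess rather than a confirmed diagnosis: in the Filebeat config above, the leading `[` in multiline.pattern is not escaped, so `[[0-9]` is read as a character class (a literal `[` or a digit) rather than a literal bracket followed by digits. A line such as `[2017-09-25 19:11:11.3854]|...` would then fail to match and, with negate: true, be appended to the previous event. A sketch with the bracket escaped, assuming every new log entry starts with `[` followed by a date:

```yaml
filebeat.prospectors:
- input_type: log
  paths:
    - C:\fan\logs\*.log
  exclude_files: ['^.*\-all\-.*\.log$']
  # Escaped "\[" matches the literal bracket that opens each log entry
  multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'
  # Lines that do NOT match the pattern are appended to the preceding line
  multiline.negate: true
  multiline.match: after
```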

