Hello everyone,
I want to view data from my nginx logs in Kibana, so I've set up a flow that looks like:
Filebeat > Logstash > Elasticsearch
Filebeat runs on the same machine as the nginx instance I want the logs from, while Logstash and Elasticsearch run together on a different machine.
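So the topology is roughly:

nginx + Filebeat (app machine)  --TLS, port 5440-->  Logstash  -->  Elasticsearch  (Logstash and Elasticsearch on the same "elastic" machine)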
My filebeat config is the following:
File filebeat.yml
filebeat.inputs:
- type: log
  enabled: false
  paths:
    - /path/to/my/logs/*.log
  exclude_files: ['.gz$']
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 1
setup.kibana:
output.logstash:
  hosts: ["my.elastic.machine:5440"]
  ssl.enabled: true
  ssl.certificate_authorities: ["/my/ca/cert.crt"]
  ssl.certificate: "/my/client/cert.crt"
  ssl.key: "/my/client/key.key"
  ssl.supported_protocols: "TLSv1.2"
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
And I have the nginx module enabled with the following configuration:
File modules.d/nginx.yml
- module: nginx
  access:
    enabled: true
    var.paths: ["/path/to/custom/logs/access.log*"]
  error:
    enabled: true
    var.paths: ["/path/to/custom/logs/error.log*"]
In the Logstash/Elastic machine I have the following configs:
File logstash.yml
path.data: /var/lib/logstash
path.config: /etc/logstash/conf.d
path.logs: /var/log/logstash
The Logstash pipeline config for parsing the nginx logs:
File nginx.conf
input {
    beats {
        ...
    }
}
filter {
  grok {
    patterns_dir => ["/etc/logstash/conf.d/patterns"]
    match => { "message" => "%{NGINX_ACCESS}" }
    remove_tag => [ "_grokparsefailure" ]
    add_tag => [ "nginx_access" ]
  }
  mutate {
    convert => ["response", "integer"]
    convert => ["bytes", "integer"]
    convert => ["responsetime", "float"]
  }
  date {
    match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
    remove_field => [ "timestamp" ]
  }
  useragent {
    source => "agent"
  }
  geoip {
    database => "/path/to/GeoLite2-City.mmdb"
    source => "remote_addr"
  }
}
output {
  ...
}
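The elided input and output blocks are nothing special; a rough sketch of what they look like (the port, SSL settings and index name here are placeholders, not my exact values):

input {
    beats {
        port => 5440
        ssl => true
        # plus ssl_certificate / ssl_key / ssl_certificate_authorities
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "filebeat-%{+YYYY.MM.dd}"
    }
}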
As you can see, I reference a patterns dir; in that dir there is only one file:
File nginx.pattern
NGINX_ACCESS %{IPORHOST:remote_addr} - %{USERNAME:remote_user} \[%{HTTPDATE:time_local}\] \"%{DATA:request}\" %{INT:status} %{NUMBER:bytes_sent} \"%{DATA:http_referer}\" \"%{DATA:http_user_agent}\"
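For reference, that pattern is meant to match the default combined nginx access log format, i.e. lines like this one (made-up values), and should produce the fields remote_addr, remote_user, time_local, request, status, bytes_sent, http_referer and http_user_agent:

203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "https://example.com/" "Mozilla/5.0"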
With all that config, the logs arrive in Elasticsearch, in the filebeat index, but some fields are not mapped correctly, specifically the GeoIP fields:
Filebeat index mapping
{
  "mapping": {
    "properties": {
      ...
      "geoip": {
        "properties": {
          "city_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "continent_code": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "country_code2": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "country_code3": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "country_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "dma_code": {
            "type": "long"
          },
          "ip": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "latitude": {
            "type": "float"
          },
          "location": {
            "properties": {
              "lat": {
                "type": "float"
              },
              "lon": {
                "type": "float"
              }
            }
          },
          "longitude": {
            "type": "float"
          },
          "postal_code": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "region_code": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "region_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "timezone": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      },
      ...
    }
  }
}
As you can see, the geoip.location field is not mapped as a geo_point, and in the "Discover" tab in Kibana it does not show up as a geo_point either.
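For comparison, what I expected for that field (the way, as far as I understand, the default Logstash template maps geoip.location on logstash-* indices) is something along these lines:

"location": {
  "type": "geo_point"
}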
I don't know if this is related to the problem, but I'm also getting the following error in the Logstash service:
[ERROR][logstash.filters.useragent][main] Uknown error while parsing user agent data {:exception=>#<TypeError: cannot convert instance of class org.jruby.RubyHash to class java.lang.String>, :field=>"agent", :event=>#<LogStash::Event:0x638caf98>}
I get a lot of those errors, all of them referencing the "agent" field, which is why I assumed they are not relevant to the geoip problem.
Thank you very much in advance.