ELK stack with Filebeat not mapping GeoIP data correctly

Hello everyone,

I want to view my nginx log data in Kibana, so I've set up a flow that looks like:

Filebeat > Logstash > Elasticsearch

Filebeat runs on the same machine as the nginx instance I want the logs from; Logstash and Elasticsearch run on a different machine.

My filebeat config is the following:

File filebeat.yml

filebeat.inputs:
- type: log
  enabled: false
  paths:
    - /path/to/my/logs/*.log
  exclude_files: ['.gz$']

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

setup.kibana:

output.logstash:
  hosts: ["my.elastic.machine:5440"]
  ssl.enable: true
  ssl.certificate_authorities: ["/my/ca/cert.crt"]
  ssl.certificate: "/my/client/cert.crt"
  ssl.key: "/my/client/key.key"
  ssl.supported_protocols: "TLSv1.2"

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

And I have the nginx module enabled with the following configuration:

File modules.d/nginx.yml

- module: nginx
  access:
    enabled: true
    var.paths: ["/path/to/custom/logs/access.log*"]

  error:
    enabled: true
    var.paths: ["/path/to/custom/logs/error.log*"]

In the Logstash/Elastic machine I have the following configs:

File logstash.yml

path.data: /var/lib/logstash
path.config: /etc/logstash/conf.d
path.logs: /var/log/logstash

The Logstash pipeline config for parsing the nginx logs:

File nginx.conf

input {
    beats {
        ...
    }
}

filter {
  grok {
    patterns_dir => ["/etc/logstash/conf.d/patterns"]
    match => { "message" => "%{NGINX_ACCESS}" }
    remove_tag => [ "_grokparsefailure" ]
    add_tag => [ "nginx_access" ]
  }
  mutate {
    convert => ["response", "integer"]
    convert => ["bytes", "integer"]
    convert => ["responsetime", "float"]
  }
  date {
    match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
    remove_field => [ "timestamp" ]
  }
  useragent {
    source => "agent"
  }
  geoip {
    database => "/path/to/GeoLite2-City.mmdb"
    source => "remote_addr"
  }
}

output {
  ...
}

As you can see, I reference a patterns dir; that dir contains only one file:

File nginx.pattern

NGINX_ACCESS %{IPORHOST:remote_addr} - %{USERNAME:remote_user} \[%{HTTPDATE:time_local}\] \"%{DATA:request}\" %{INT:status} %{NUMBER:bytes_sent} \"%{DATA:http_referer}\" \"%{DATA:http_user_agent}\"
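For reference, a default-format access line like this (a made-up example, not from my real logs) is what that pattern is meant to match:

203.0.113.5 - - [10/Jan/2020:13:55:36 +0100] "GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0 (X11; Linux x86_64)"

It should extract remote_addr, remote_user, time_local, request, status, bytes_sent, http_referer and http_user_agent.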

With all that config, the logs arrive in Elasticsearch in the filebeat index, but the fields are not correctly mapped, specifically the GeoIP fields:

Filebeat index mapping

{
  "mapping": {
    "properties": {
      ...
      "geoip": {
        "properties": {
          "city_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "continent_code": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "country_code2": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "country_code3": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "country_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "dma_code": {
            "type": "long"
          },
          "ip": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "latitude": {
            "type": "float"
          },
          "location": {
            "properties": {
              "lat": {
                "type": "float"
              },
              "lon": {
                "type": "float"
              }
            }
          },
          "longitude": {
            "type": "float"
          },
          "postal_code": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "region_code": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "region_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "timezone": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      },
      ...
    }
  }
}

As you can see, the geoip.location field is not mapped as a geo_point. If I go to the "Discover" tab in Kibana, the geoip field is not shown as a geo_point either.
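What I would expect for that field (just a sketch of the mapping I'm after, not something I currently have) is something like:

"geoip": {
  "properties": {
    "location": {
      "type": "geo_point"
    }
  }
}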

I don't know if this is related to the problem, but I'm getting the following error from the logstash service:

[ERROR][logstash.filters.useragent][main] Uknown error while parsing user agent data {:exception=>#<TypeError: cannot convert instance of class org.jruby.RubyHash to class java.lang.String>, :field=>"agent", :event=>#<LogStash::Event:0x638caf98>}

I get a lot of those errors, all referencing the "agent" field, which is why I assumed it is not relevant to the geoip problem.

Thank you very much in advance.

I don't see which version you are using, but current versions are directing us toward ECS ("Elastic Common Schema"). In that world, the geoip target for nginx would be "nginx.access.geoip". In the snipped (...) area of your mappings, are these geoip fields nested under nginx.access?

I suspect you either need to add a target to the geoip filter or add a custom mapping for the non-nested geoip fields.
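For example, something along these lines in your pipeline (just a sketch; the nested target name is an assumption based on the ECS-style module layout):

geoip {
  database => "/path/to/GeoLite2-City.mmdb"
  source => "remote_addr"
  target => "[nginx][access][geoip]"
}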

The filebeat modules include ingest pipelines that you can use as examples for (and sometimes convert to) logstash filters.
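If Filebeat has already loaded them into Elasticsearch, you can inspect them from the Kibana Dev Tools console (exact pipeline names depend on your version and enabled modules), e.g.:

GET _ingest/pipeline/filebeat-*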

I'm using the latest versions of everything. Elasticsearch, Kibana, Logstash and Filebeat are all on version 7.5.1.

In the whole mapping there is no reference to any nginx field.

I've been looking at the index templates and I saw that there is a template for logstash indices that auto-maps the geoip fields. I tried changing the Logstash output to a logstash-* index instead of a filebeat-* one and it worked!
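For reference, the output change was roughly this (the host and index name here are just illustrative):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}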

The "problem" is that all of the filebeat dashboards are intended for using the filebeat-* index pattern. Should I create a template for filebeat indices? Or there is another problem elsewhere?

Here's what I think is happening: I suspect the first startup of filebeat loaded a logstash-7.5.1 template, which is probably the default if no setup.template.pattern is defined. I have that defined as:

setup.template.pattern: "filebeat-%{[beat.version]}*"

I think that's because it was the old 5.x/6.x default and it fits the Kibana defaults. Making that change and restarting filebeat to load the template might fix the issue. Simply copying your logstash-7.5.1 template to a new template called filebeat-7.5.1 would probably have the same result.
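Roughly, from the Kibana Dev Tools console (a sketch only; paste the settings/mappings from the GET response in place of the elided parts, and point index_patterns at filebeat-*):

GET _template/logstash-7.5.1

PUT _template/filebeat-7.5.1
{
  "index_patterns": ["filebeat-*"],
  "settings": { ... },
  "mappings": { ... }
}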

Of course, the filebeat index will have to roll over, or a new one will have to be created, for the template to take effect.
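Once a new filebeat-* index has been created, you can verify the field with something like:

GET filebeat-*/_mapping/field/geoip.location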

Thanks so much!! It worked!

Good

One correction: Logstash may have created that template, not Filebeat. I've never seen this behavior before and I don't know how Filebeat would have hit on that name; the filebeat.reference.yml says "filebeat-..." is the default.
