Can't get GeoIp data in Elasticsearch/Fluentd


(Maciej Przygrodzki) #1

Hello, I'm using Fluentd to push logs into Elasticsearch. The main goal is to have nginx-ingress logs published into Elasticsearch with GeoIP location data, so that I can visualize metrics on a dashboard map based on client IPs. I'm using fluent-plugin-geoip-1.2.0 to add this data to the records. I'm already publishing nginx-ingress logs to Elasticsearch in JSON format and that works as it should: I can search the data in Kibana, and the output looks like this:

{"proxy_protocol_addr": "89.72.XXX.XXX","remote_addr": "89.72.XXX.XXX", "proxy_add_x_forwarded_for": "89.72.110.107", "request_id": "8a49870b2cee49911b0793ec97226036","remote_user": "", "time_local": "17/Jul/2018:20:42:15 +0000", "request" : "GET /app/kibana HTTP/1.1", "status": "200", "vhost": "kibana.staging.domain.com","body_bytes_sent": "14225", "http_referer": "http://kibana.staging.domain.com/", "http_user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36", "request_length" : "480", "request_time" : "0.246", "proxy_upstream_name": "monitoring-kibana-80", "upstream_addr": "100.100.72.251:5601", "upstream_response_length": "14196", "upstream_response_time": "0.244", "upstream_status": "200"}

Here's my config for geoip data:

<filter kubernetes.**>
    @type geoip
    # Specify one or more geoip lookup field which has ip address (default: host)
    # in the case of accessing nested value, delimit keys by dot like 'host.ip'.
    geoip_lookup_keys  remote_addr
    # Specify optional geoip database (using bundled GeoLiteCity database by default)
    geoip_database    "/var/lib/gems/2.3.0/gems/fluent-plugin-geoip-1.2.0/data/GeoLiteCity.dat"
    # Set adding field with placeholder (more than one settings are required.)
    <record>
      city            ${city["remote_addr"]}
      lat             ${latitude["remote_addr"]}
      lon             ${longitude["remote_addr"]}
      country_code3   ${country_code3["remote_addr"]}
      country         ${country_code["remote_addr"]}
      country_name    ${country_name["remote_addr"]}
      dma             ${dma_code["remote_addr"]}
      area            ${area_code["remote_addr"]}
      region          ${region["remote_addr"]}
      geoip           '{"location":[${longitude["remote_addr"]},${latitude["remote_addr"]}]}'
    </record>
    # Avoid a stacktrace error from Elasticsearch caused by a `[null, null]` array.
    skip_adding_null_record  true
    # Set log_level for fluentd-v0.10.43 or earlier (default: warn)
    @log_level         info
    # Set buffering time (default: 0s)
    # flush_interval    1s
</filter>

I've checked the mappings using "GET /_template" and "GET /_template/logstash-*", and in both cases I see that the mappings are created as shown below:

"geoip": {
            "dynamic": true,
            "properties": {
              "ip": {
                "type": "ip"
              },
              "location": {
                "type": "geo_point"
              },
              "latitude": {
                "type": "half_float"
              },
              "longitude": {
                "type": "half_float"
              }

What is wrong, or what am I missing, to get this working as expected and see the GeoIP data?
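Since the template maps `geoip.location` as `geo_point`, one thing worth checking is whether the indexed documents actually carry a value in a shape that type accepts. The sketch below is only an illustration, not part of the plugin or of Elasticsearch: `is_geo_point` and `diagnose` are hypothetical helper names, the coordinates are made-up sample values, and only three of the accepted geo_point forms are covered (the `[lon, lat]` array, the `{"lat", "lon"}` object, and the `"lat,lon"` string). It takes the `_source` of a document copied from Kibana and reports which of a few likely problems applies.

```python
def is_number(value):
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_geo_point(value):
    """Check a value against three common geo_point shapes.

    Geohash and WKT strings are also valid geo_point input but are
    deliberately not handled in this sketch.
    """
    if isinstance(value, (list, tuple)) and len(value) == 2 and all(map(is_number, value)):
        lon, lat = value  # the array form is longitude first
    elif isinstance(value, dict) and is_number(value.get("lat")) and is_number(value.get("lon")):
        lat, lon = value["lat"], value["lon"]
    elif isinstance(value, str):
        parts = value.split(",")  # the string form is "lat,lon"
        if len(parts) != 2:
            return False
        try:
            lat, lon = float(parts[0]), float(parts[1])
        except ValueError:
            return False
    else:
        return False
    return -90 <= lat <= 90 and -180 <= lon <= 180


def diagnose(source):
    """Return a short description of the geoip state of one document."""
    geoip = source.get("geoip")
    if geoip is None:
        return "no geoip field: the filter did not add it to this record"
    if not isinstance(geoip, dict):
        return "geoip is not an object"
    if not is_geo_point(geoip.get("location")):
        return "geoip.location is not a valid geo_point"
    return "ok"


if __name__ == "__main__":
    # Made-up sample documents, for illustration only.
    print(diagnose({"remote_addr": "203.0.113.7"}))
    print(diagnose({"geoip": '{"location":[21.01,52.23]}'}))
    print(diagnose({"geoip": {"location": [21.01, 52.23]}}))
```

If the first message comes back, the geoip filter is probably not touching the record at all (for example because `remote_addr` is not a top-level key at that point in the pipeline); if the second comes back, the `geoip` value was indexed as a plain string rather than an object. Both are guesses to verify, not conclusions the thread confirms.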


(Mark Walkom) #2

I don't know fluentd, but what does the output document it creates look like?
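One way to answer that question is to pull a recent document straight from the index and look at it. The following is a minimal sketch that only builds the two search request bodies; it does not talk to a cluster. It assumes the indices match `logstash-*` (the pattern used in the template check above) and that documents carry an `@timestamp` field; the function names are hypothetical.

```python
import json


def latest_document_query(size=1):
    """Search body that returns the newest documents first."""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
    }


def geoip_exists_query(field="geoip.location"):
    """Search body that matches only documents where `field` has a value."""
    return {
        "size": 1,
        "query": {"exists": {"field": field}},
    }


if __name__ == "__main__":
    # Paste either body into Kibana Dev Tools after: GET logstash-*/_search
    print(json.dumps(latest_document_query(), indent=2))
    print(json.dumps(geoip_exists_query(), indent=2))
```

If the second query returns no hits while the first one returns ordinary nginx-ingress documents, the records are reaching Elasticsearch without the geoip fields, which points at the Fluentd side rather than at the mapping.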


(Maciej Przygrodzki) #3

I believe I'm missing something in the output. Here is my output configuration:

output.conf: |
    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
    <filter **>
      @type grep
      <exclude>
        key log
        pattern ElasticsearchErrorHandler
      </exclude>
    </filter>

<match **>
  @id elasticsearch
  @type elasticsearch
  @log_level info
  include_tag_key true
  host "#{ENV['OUTPUT_HOST']}"
  port "#{ENV['OUTPUT_PORT']}"
  logstash_format true
  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
    flush_mode interval
    retry_type exponential_backoff
    flush_thread_count 2
    flush_interval 5s
    retry_forever
    retry_max_interval 30
    chunk_limit_size "#{ENV['OUTPUT_BUFFER_CHUNK_LIMIT']}"
    queue_limit_length "#{ENV['OUTPUT_BUFFER_QUEUE_LIMIT']}"
    overflow_action block
  </buffer>
</match>
