Mapping / geoip misunderstanding


(Anthony Cleaves) #1

I am continuing to struggle with the GEO_IP mapping when it comes to loading data. First, I will describe the steps I am taking in my build, then I will post my configs below.

  • Install Elasticsearch
  • Install SearchGuard
  • Install Logstash
  • Install Kibana
  • Load Elasticsearch templates (PUT using curl)
  • Install Filebeat
  • Push nginx.access logs
  • Create relevant index
  • Experience missing geo.location field
  • Attempt remap via Kibana GUI

Below is my mapping (on Pastebin, as it's a large set of data):

https://pastebin.com/9G9TZMpg

The important part in question is:

    "nginx": {
      "properties": {
        "access": {
          "properties": {
            "agent": {
              "norms": false,
              "type": "text"
            },
            "body_sent": {
              "properties": {
                "bytes": {
                  "type": "long"
                }
              }
            },
            "geoip": {
              "properties": {
                "continent_name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "country_iso_code": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "location": {
                  "type": "geo_point"
                }
              }
            },

I push nginx logs to Logstash using Filebeat in the following manner:

- input_type: log
  paths:
    - /var/log/nginx/access.log
  fields:
    log_type: nginx.access
  tags: nginx.access

Here is my Logstash config to handle that:

filter {
        if [fields][log_type] == "nginx.access" {
                grok {
                        match => { "message" => "%{HTTPD_COMBINEDLOG}" }
                }
                geoip {
                        source => "clientip"
                        target => "[nginx][access][geoip]"
                }
        }
        if [fields][log_type] == "nginx.error" {
                grok {
                        match => { "message" => "%{HTTPD20_ERRORLOG}" }
                }
                geoip {
                        source => "clientip"
                        target => "[nginx][error][geoip]"
                }
        }
        if [fields][log_type] == "nginx.timings" {
                grok {
                        match => { "message" => "%{URIHOST:host} - - %{SYSLOG5424SD:timestamp} \"%{DATA:response}\" %{GREEDYDATA:message}" }
                }
                geoip {
                        source => "URIHOST"
                        target => "[nginx][timings][geoip]"
                }
        }
        if [fields][log_type] == "nginx" {
                grok {
                        match => { "message" => "%{HTTPD_COMBINEDLOG}" }
                }
                geoip {
                        source => "clientip"
                        target => "[nginx][access][geoip]"
                }
        }
}

Unfortunately, when I load Kibana, no field called geoip.location exists; I only see geo.location.lat/lon, as follows:

Screenshot from 2017-09-28 08-27-59

Here are my mappings from curl:

https://pastebin.com/vEgGnCxh

I can clearly see geoip.location set as a geo_point field.
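
To check this without reading the whole mapping by eye, here is a minimal sketch, assuming the mapping has been fetched and parsed into a Python dict. The helper name field_type is hypothetical, and the dict below is only a trimmed copy of the excerpt quoted earlier in this post, not the full mapping.

```python
def field_type(properties, dotted_path):
    """Return the mapped type of a dotted field path, or None if it is unmapped.

    `properties` is the "properties" object of a mapping, as in the excerpt above.
    Object fields that only contain sub-fields have no "type" and also yield None.
    """
    node = {"properties": properties}
    for part in dotted_path.split("."):
        children = node.get("properties", {})
        if part not in children:
            return None
        node = children[part]
    return node.get("type")


# Trimmed copy of the mapping excerpt from this post (illustrative only).
mapping_properties = {
    "nginx": {"properties": {"access": {"properties": {"geoip": {"properties": {
        "location": {"type": "geo_point"},
    }}}}}}
}

print(field_type(mapping_properties, "nginx.access.geoip.location"))  # geo_point
print(field_type(mapping_properties, "nginx.error.geoip.location"))   # None
```

A path that returns None here would be mapped dynamically by Elasticsearch when the first document arrives, so it would not become a geo_point.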

What am I missing here? Why does my data not correctly populate geoip.location?

EDIT:

Here is the raw JSON from a document in Elasticsearch:

{
  "_index": "nginx-2017.09.28",
  "_type": "log",
  "_id": "AV7HOdx7iyoqWt3_QQwZ",
  "_version": 1,
  "_score": null,
  "_source": {
    "request": "/",
    "agent": "\"Cloud mapping experiment. Contact research@pdrlabs.net\"",
    "offset": 133,
    "nginx": {
      "access": {
        "geoip": {
          "timezone": "America/New_York",
          "ip": "54.161.85.67",
          "latitude": 39.0481,
          "continent_code": "NA",
          "city_name": "Ashburn",
          "country_name": "United States",
          "country_code2": "US",
          "dma_code": 511,
          "country_code3": "US",
          "region_name": "Virginia",
          "location": {
            "lon": -77.4728,
            "lat": 39.0481
          },
          "postal_code": "20149",
          "region_code": "VA",
          "longitude": -77.4728
        }
      }
    },

You can see location is there.


(Mark Walkom) #2

That doesn't match, because the mapping is looking for geoip.location. You need to be explicit with the mappings in that regard; a field doesn't inherit the geo_point mapping just because it has geoip in its name.
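
As a rough illustration of being explicit, the sketch below builds the mapping fragment that a given Logstash geoip target would need. It assumes the target uses the bracketed field reference syntax from the config above; the function name geoip_location_mapping is hypothetical and not part of any Elastic tooling.

```python
import re


def geoip_location_mapping(logstash_target):
    """Build the "properties" fragment that maps <target>.location as a geo_point.

    `logstash_target` is a Logstash field reference such as "[nginx][error][geoip]".
    """
    names = re.findall(r"\[([^\]]+)\]", logstash_target)
    if not names:
        raise ValueError("expected a field reference like [a][b][c]")
    fragment = {"location": {"type": "geo_point"}}
    # Wrap from the innermost field outwards so each level gets its own "properties".
    for name in reversed(names):
        fragment = {name: {"properties": fragment}}
    return fragment


# One fragment per geoip target used in the Logstash filter.
error_fragment = geoip_location_mapping("[nginx][error][geoip]")
print(error_fragment)
```

Each target in the filter ([nginx][access][geoip], [nginx][error][geoip], [nginx][timings][geoip]) needs its own entry like this in the template; the fragment for one path does nothing for the others.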


(Anthony Cleaves) #3

Hrmm, that shouldn't even be there. I will remove that, but the issue we are looking at is nginx.access.


(Anthony Cleaves) #4

So I deleted all the nginx data from Elasticsearch using curl, deleted the index via Kibana, removed the problematic geoip mapping for nginx.error, re-added the index, and now it works.

Was that really the fix?


(Mark Walkom) #5

Maybe it was another, bad template?


(Anthony Cleaves) #6

I haven't reloaded the template, so that's still the same?

The only changes I made were mentioned.

Very odd, thanks for the assistance. I am finding mappings quite temperamental.


(Mark Walkom) #7

They aren't, they can just be a bit confusing. Even I had issues with them back when I started with Elasticsearch :)

If you wanted to help improve things, it'd be really awesome if you created a GitHub issue with a bit of an explainer and we can look at what we can do. Because sometimes it's hard to be objective when you've been working with a technology for 5+ years like some of us, so it's great to have a newer, fresher perspective :)


(Anthony Cleaves) #8

Sorry, I meant to say my implementations are flaky.

I believe the problem is that the index is being created before I load the template, which would make sense.

I guess when I finally reach my production deployment, I will ensure I have no indexes when I load the template.

I think this is caused by my dev environment re-using the same hostnames. It's totally my fault.
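
Since index templates are only applied when an index is created, one way to guard against this is to look for indices that already match the template pattern before loading it. This is a minimal sketch, assuming the pattern is "nginx-*" (inferred from the index name nginx-2017.09.28 above) and that the index names have already been collected, for example from GET _cat/indices. The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def indices_predating_template(template_pattern, index_names):
    """Return the existing indices that match the template's index pattern.

    Any index listed here was created before the template was loaded, so it
    kept whatever mapping it got at creation time.
    """
    return [name for name in index_names if fnmatchcase(name, template_pattern)]


# Example index names; in practice these would come from the cluster.
existing = ["nginx-2017.09.28", ".kibana", "filebeat-2017.09.28"]
print(indices_predating_template("nginx-*", existing))  # ['nginx-2017.09.28']
```

If the list is non-empty, those indices would need to be deleted or reindexed after the template is loaded for the geo_point mapping to take effect.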


(Mark Walkom) #9

Yep:)

