Logstash AWS GeoIP Issue?

Hello,

We're running an ELK cluster on AWS. We also run a single box with all of ELK on it to verify changes locally before pushing to the cloud.

We're running into an issue where the local box can use the GeoIP2-City.mmdb database and pull all of the information out correctly, but when we try to do the same thing on AWS we get errors. Specifically, here's what we're seeing:

Unknown error while looking up GeoIP data {:exception=>#<NoMethodError: undefined method `' for nil:NilClass>, :field=>nil, :event=>#<LogStash::Event:0x47f3203f @accessors=#<LogStash::Util::Accessors:0x2aca2d00 @store={"message"=>"2015-09-14T15:19:34.328933Z clusterID 66.109.145.98:56225 10.0.1.128:6443 0.000047 0.09505 0.000035 200 200 0 6527 GET https://clusterInfo", "@version"=>"1", "@timestamp"=>"2015-09-14T17:44:28.805Z", "host"=>"10.0.0.102:55794", "type"=>"ELB", "tags"=>["_grokparsefailure"]}, @lut={"host"=>[{"message"=>"2015-09-14T15:19:34.328933Z clusterID 66.109.145.98:56225 10.0.1.128:6443 0.000047 0.09505 0.000035 200 200 0 6527 GET https://clusterInfo", "@version"=>"1", "@timestamp"=>"2015-09-14T17:44:28.805Z", "host"=>"10.0.0.102:55794", "type"=>"ELB", "tags"=>["_grokparsefailure"]}, "host"], "type"=>[{"message"=>"2015-09-14T15:19:34.328933Z clusterID 66.109.145.98:56225 10.0.1.128:6443 0.000047 0.09505 0.000035 200 200 0 6527 GET https://clusterInfo", "@version"=>"1", "@timestamp"=>"2015-09-14T17:44:28.805Z", "host"=>"10.0.0.102:55794", "type"=>"ELB", "tags"=>["_grokparsefailure"]}, "type"], "[type]"=>[{"message"=>"2015-09-14T15:19:34.328933Z clusterID 66.109.145.98:56225 10.0.1.128:6443 0.000047 0.09505 0.000035 200 200 0 6527 GET https://clusterInfo", "@version"=>"1", "@timestamp"=>"2015-09-14T17:44:28.805Z", "host"=>"10.0.0.102:55794", "type"=>"ELB", "tags"=>["_grokparsefailure"]}, "type"], "message"=>[{"message"=>"2015-09-14T15:19:34.328933Z clusterID 66.109.145.98:56225 10.0.1.128:6443 0.000047 0.09505 0.000035 200 200 0 6527 GET https://clusterInfo", "@version"=>"1", "@timestamp"=>"2015-09-14T17:44:28.805Z", "host"=>"10.0.0.102:55794", "type"=>"ELB", "tags"=>["_grokparsefailure"]}, "message"], "tags"=>[{"message"=>"2015-09-14T15:19:34.328933Z clusterID 66.109.145.98:56225 10.0.1.128:6443 0.000047 0.09505 0.000035 200 200 0 6527 GET https://clusterInfo", "@version"=>"1", "@timestamp"=>"2015-09-14T17:44:28.805Z", "host"=>"10.0.0.102:55794", "type"=>"ELB", "tags"=>["_grokparsefailure"]}, "tags"], "client_ip"=>[{"message"=>"2015-09-14T15:19:34.328933Z clusterID 66.109.145.98:56225 10.0.1.128:6443 0.000047 0.09505 0.000035 200 200 0 6527 GET https://clusterInfo", "@version"=>"1", "@timestamp"=>"2015-09-14T17:44:28.805Z", "host"=>"10.0.0.102:55794", "type"=>"ELB", "tags"=>["_grokparsefailure"]}, "client_ip"]}>, @data={"message"=>"2015-09-14T15:19:34.328933Z clusterID 66.109.145.98:56225 10.0.1.128:6443 0.000047 0.09505 0.000035 200 200 0 6527 GET https://clusterInfo", "@version"=>"1", "@timestamp"=>"2015-09-14T17:44:28.805Z", "host"=>"10.0.0.102:55794", "type"=>"ELB", "tags"=>["_grokparsefailure"]}, @cancelled=false>, :level=>:error}

We have the exact same database, OS version, ELK versions, and logstash.conf file (except for where the output points) running on the two ELK platforms, local and cloud. We also stood up another ELK box on AWS to see if having all of ELK on one box somehow prevented the issue, but it's having the same problem with Logstash as the other AWS device.
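If it helps with reproducing this, the geoip filter can be exercised in isolation with a throwaway stdin-to-stdout pipeline (a minimal sketch, not part of our real config; paste a bare IP such as 66.109.145.98 and inspect the printed event):

input {
    stdin {
        type => "ELB"
    }
}
filter {
    geoip {
        # The whole stdin line is treated as the IP to look up.
        source => "message"
        database => "/etc/logstash/GeoIP2-City.mmdb"
    }
}
output {
    stdout {
        codec => rubydebug
    }
}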

Our logstash.conf file has to be attached in a second post; see below.

Any ideas on how to fix this?

As mentioned in the first post, here's a copy of our logstash.conf file.

input {
    tcp {
        port => 9292
    }
    tcp {
        port => 9230
        type => "ELB"
    }
    tcp {
        port => 9240
        type => "ArcGIS"
    }
    udp {
        port => 9292
    }
}

filter {
    if [type] == "ELB" {
        grok {
            match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int} %{IP:backend_ip}:%{NUMBER:backend_port:int} %{NUMBER:request_processing_time:float} %{NUMBER:backend_processing_time:float} %{NUMBER:response_processing_time:float} %{NUMBER:elb_status_code:int} %{NUMBER:backend_status_code:int} %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int} %{WORD:verb} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion} %{NOTSPACE:user_agent} %{NOTSPACE:ssl_cipher} %{NOTSPACE:ssl_protocol}" }
        }
        geoip {
            source => "client_ip"
            target => "geoip"
            database => "/etc/logstash/GeoIP2-City.mmdb"
            add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
            add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
        }
        mutate {
            convert => [ "[geoip][coordinates]", "float" ]
        }
    }

    if [type] == "ArcGIS" {
        grok {
            match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:LogType} %{NUMBER:code} %{NOTSPACE:target} %{NOTSPACE:machine} %{NUMBER:process} %{NUMBER:thread} %{NOTSPACE:user} %{NUMBER:XMin} %{NUMBER:YMin} %{NUMBER:XMax} %{NUMBER:YMax} %{NUMBER:XCent} %{NUMBER:YCent} %{NUMBER:SizeOne} %{NUMBER:SizeTwo} %{NUMBER:Scale}" }
        }
        if [XCent] and [YCent] {
            mutate {
                add_field => [ "[userMapXY]", "%{XCent}" ]
                add_field => [ "[userMapXY]", "%{YCent}" ]
            }
            mutate {
                convert => [ "[userMapXY]", "float" ]
            }
        }
    }
}

output {
    elasticsearch {
        host => "localhost:9200"
        protocol => "http"
    }
}

The records that failed have a "_grokparsefailure" tag set, which indicates that the grok parsing failed and that the client_ip field, which the geoip filter relies on, was therefore never extracted. Are you feeding different data into the two systems?
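As a side note, if you want the geoip filter to stay quiet on events grok couldn't parse, you can guard it on that tag (a minimal sketch based on your config; it doesn't fix the underlying parse failure):

filter {
    if [type] == "ELB" and "_grokparsefailure" not in [tags] {
        # Only attempt the lookup when grok actually extracted client_ip.
        geoip {
            source => "client_ip"
            target => "geoip"
            database => "/etc/logstash/GeoIP2-City.mmdb"
        }
    }
}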

The data is fed the exact same way in the two systems.

Also, if we change the database from the .mmdb file to the .dat file within the Logstash config, AWS happily parses all of the data.
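For reference, the only change in the working AWS config is the database path in the geoip filter (a sketch; the legacy filename shown here, GeoIP-City.dat, stands in for whatever the .dat file from MaxMind is actually called on our box):

geoip {
    source => "client_ip"
    target => "geoip"
    # Pointing at the legacy .dat file instead of the GeoIP2 .mmdb makes the errors go away on AWS.
    database => "/etc/logstash/GeoIP-City.dat"
}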

Is that the latest release of the MaxMind DB? We only support the original, now-deprecated version, not the newest release.

Please comment on this GH issue if you'd like to see support for the newer version of the database.

Yes, that's the proprietary version of the database. What's interesting is that it works just fine on the local machine but won't work on AWS. Is there some difference about running Logstash in the cloud that would cause this to be an issue?

What version of LS?

Also, is it the proprietary one, or the newer version of the "free" one?

Logstash 1.4.2

This is a copy of the paid version of the MaxMind database.