Trouble with GeoIP mapping from Logstash to Elasticsearch

Brand new to this product. I have an nginx server sending logs via Filebeat to Logstash, which is working. The geoip filter is also working as far as I can tell, except that I appear to be missing whatever turns geoip.location into a geo_point. I've determined that the geoip information is being populated in Logstash but not being mapped in Elasticsearch. Here is my Logstash config:

input {
  beats { port => 5044 }
}

filter {
  grok {
    patterns_dir => ["/etc/logstash/patterns"]
    match => { "message" => "%{SED_NGINX_COMBINE}" }
  }
  geoip {
    source => "clientip"
    target => "geoip.location"
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

I'm also using a custom log pattern (Combined + X-Forwarded-For header); here it is:
SED_NGINX_COMBINE %{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QUOTEDSTRING:xforwardedfor_header}

I've been reading that "the default Elasticsearch template provided with the elasticsearch output maps the [geoip][location] field to an Elasticsearch geo_point", though I'm not sure how to verify this or how to actually make it happen.

While troubleshooting I came across the command "curl -XGET localhost:9200/nginx-*/_mapping" to see the mapping layout, and in doing so noticed that the geo_point type wasn't listed in the mapping. So the geoip information is being populated in Logstash but not mapped in Elasticsearch; I just don't know how to fix this. I'm not very familiar with JSON.
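From what I've read, a correctly mapped field would show up in that mapping output as a geo_point entry. This is my rough guess at a trimmed excerpt (the geoip/location names come from the default-template quote above, so treat the exact layout as a guess), not actual output from my cluster:

curl -XGET 'localhost:9200/nginx-*/_mapping?pretty'

# somewhere in the returned JSON I would expect something like:
#
#   "geoip": {
#     "properties": {
#       "location": { "type": "geo_point" }
#     }
#   }
#
# but in my output there is no geo_point anywhere.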

Thanks

Hi @cisaksen,

I always do it this way:

  • upload the logs
  • take the mapping that ES generated
  • change the field types, etc.
  • create an index template from it that will apply to your index (rough sketch below)
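As a rough sketch of that last step on ES 5.x (the index, template, and file names here are just examples I made up, not something you have to use):

# 1. grab the mapping ES generated for one of the daily indices
curl -XGET 'localhost:9200/nginx-2017.02.22/_mapping?pretty' > nginx-mapping.json

# 2. edit the field types you care about (e.g. make the location a geo_point),
#    wrap the result in a template body with "template": "nginx-*",
#    and push it so it applies to every future nginx-* index
curl -XPUT 'localhost:9200/_template/nginx' -H 'Content-Type: application/json' -d @nginx-template.json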

Also, don't forget that mapping changes won't apply to an existing index; for that you need to rebuild the index, or delete and recreate it.

You can also use Kopf; it's still working with ES 5.1.2 for me.

I also add a field for it and change the type later in my index template:

geoip {
  source => "client_ip"
  target => "geoip"
  add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
  add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
}

I think you can also change the field type in Logstash, but I've never done that before:

filter {
  mutate {
    convert => { "fieldname" => "integer" }
  }
}

Sorry, but this is still not working. I still cannot get Kibana to identify a geo_point.

error: No Compatible Fields: The "nginx-*" index pattern does not contain any of the following field types: geo_point

The nginx-* comes from the Filebeat configuration on the web server sending the logs.

Did you rebuild the index, or create a new index with the new field?

If you do curl -XGET localhost:9200/nginx-2017.02.22/_mapping, you'll see that your location field has type long, so if you change the mapping you need to rebuild the index.

But if you're using nginx-* with a date, a new index should be created every day based on the date, and the new index will pick up the change.

like:
haproxy-2017.01.25
haproxy-2017.01.26
haproxy-2017.01.27

Each of them gets a fresh mapping; you can override this by putting an index template in place.

Now that you've added the mutate in Logstash, check again today against the newly created index.

try curl -XGET localhost:9200/nginx-2017.02.23/_mapping
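If the full mapping dump is too noisy, the field-level mapping API shows just the field you care about (assuming the geoip.coordinates field from my snippet above; adjust the date and field name to yours):

curl -XGET 'localhost:9200/nginx-2017.02.23/_mapping/field/geoip.coordinates?pretty'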

Still wrapping my head around how all this works together.

So in Filebeat I configured it like this:

output.logstash:
  # The Logstash hosts
  enabled: true
  hosts: ["x.x.x.x:5044"]
  index: "nginx"

My understanding is that Logstash sees this index tag, adds the date to it, and then passes it to Elasticsearch as the index name - yes, no, maybe, close?

Anyway - how do I rebuild an index that I didn't actually create? And if I delete the existing index in ES, does that delete the log entries that were added under that index?
If I need to create an index template, do I create a file somewhere with its definition, or is it all done interactively?

Maybe there's a better question here: is there a better way to pass web logs to Logstash other than Filebeat? I read that using an NFS share can cause problems, which is why I chose Filebeat. I'm just running this on a staging server at the moment, but our production environment has 3 web servers.

Another question: are there hardware specs anywhere for sizing disk, RAM, and CPU?

Sorry, one more: if I want to pass logs from different systems, like Varnish and nginx, do I create 2 conf.d/*.conf files or does everything reside in one?

Sorry for all the questions.

Hi again :slight_smile:

My understanding is that Logstash sees this index tag, adds the date to it, and then passes it to Elasticsearch as the index name - yes, no, maybe, close?

You're creating the index with your Elasticsearch output in Logstash:

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"   <- here
    document_type => "%{[@metadata][type]}"
  }
}
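My understanding of how those pieces fit together (so treat the details as an assumption rather than gospel): the index option you set in Filebeat arrives in Logstash as [@metadata][beat], and the ES output just appends the date to it:

# Filebeat side:                index: "nginx"
# arrives in Logstash as:       [@metadata][beat] == "nginx"
# the ES output then expands:   %{[@metadata][beat]}-%{+YYYY.MM.dd}
# into a daily index name like: nginx-2017.02.22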

Anyway - how do I rebuild an index that I didn't actually create?

Here
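On ES 5.x the usual tool for that is the _reindex API. A rough sketch (the destination index name is made up, and the corrected mapping or template has to be in place before you copy the data into it):

curl -XPOST 'localhost:9200/_reindex?pretty' -H 'Content-Type: application/json' -d '
{
  "source": { "index": "nginx-2017.02.22" },
  "dest":   { "index": "nginx-2017.02.22-fixed" }
}'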

And if I delete the existing index in ES, does that delete the log entries that were added under that index?

Deleting the index only removes the documents from Elasticsearch, not your source log files. If you're using Filebeat, it keeps a registry file that tracks what it has already shipped, so if you still have the log files from the previous days, remove the registry file and let Filebeat collect them all again.
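Roughly like this (the registry path is the usual default for the Linux packages, but check your filebeat.yml, yours may differ):

# stop Filebeat, remove the registry so it forgets what it already shipped,
# then start it again and it will re-read the files from the beginning
sudo service filebeat stop
sudo rm /var/lib/filebeat/registry        # path is an assumption, verify it on your host
sudo service filebeat start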

If I need to create an index template, do I create a file somewhere with its definition, or is it all done interactively?

I think Filebeat can also load templates, but you can also create them via the ES API: Index Template

The easy way would be to grab the current mapping, edit the fields you need, and save it as an index template.

Maybe there's a better question here: is there a better way to pass web logs to Logstash other than Filebeat? I read that using an NFS share can cause problems, which is why I chose Filebeat. I'm just running this on a staging server at the moment, but our production environment has 3 web servers.

I think Filebeat is fine with the options you've got, but you could also use rsyslog.

Another question: are there hardware specs anywhere for sizing disk, RAM, and CPU?

From what I'm seeing on my servers, Filebeat needs almost nothing; Logstash needs more, depending on how many logs you send it.

Sorry, one more: if I want to pass logs from different systems, like Varnish and nginx, do I create 2 conf.d/*.conf files or does everything reside in one?

I'm using one Logstash config for 5 different logs.

example:

input {
  beats {
    port => "5044"
  }

  sqs {
    type => "smtp"
    queue => "xxx"
    region => "xxx"
    aws_credentials_file => "xxx"
  }

  s3 {
    type => "whitelabel"
    bucket => "xxx"
    region => "xxx"
    aws_credentials_file => "xxx"
  }
}

with "type" you can add a tag to the inputs to separate it later in the Filter and Output.

Filebeat can do it in the input:

filebeat.prospectors:

- input_type: log
  document_type: nginx_log
  paths:
    - /var/log/nginx.log

document_type here is the same as type in Logstash.

So we can do this in the Logstash filter:

filter {
  if [type] == "whitelabel" {
    date {
      match => [ "accept_date", "dd/MMM/yyyy:HH:mm:ss.SSS"]
    }

    geoip {
      source => "client_ip"
      target => "geoip"
      add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
      add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
    }

    mutate {
      convert => [ "[geoip][coordinates]", "float"]
    }
  }
}

And in the output too:

output {
  if [type] == "nginx_log" {
    # Nginx
    elasticsearch {
      hosts => ["https://www.example.com:9999"]
      user => ["test"]
      password => ["test"]
      index => "nginx-%{+YYYY.MM.dd}"
    }
  }
  else if [type] == "smtp" {
    # SMTP
    elasticsearch {
      hosts => ["https://www.example.com:9999"]
      user => ["test"]
      password => ["test"]
      index => "smtp-%{+YYYY.MM.dd}"
    }
  }
  else if [type] == "whitelabel" {
    # Whitelabel
    elasticsearch {
      hosts => ["https://www.example.com:9999"]
      user => ["test"]
      password => ["test"]
      index => "whitelabel-%{+YYYY.MM.dd}"
    }
  }
}

Hope that helps!

OK, I discovered where my lack of understanding came into play: I found the default mapping template and saw that ES will only apply it if the index name starts with logstash.

Any other name will need a custom mapping template.

Also, it turns out you don't need the

add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]

mutate {
  convert => [ "[geoip][coordinates]", "float"]
}

since the default mapping already uses geoip.location, which is the same as you have above, but maps it to the geo_point type.

Thanks for your help.

You need to copy the template that LS uses and then adapt it to the index pattern you have, i.e. nginx-*.
Or use an index name of logstash-nginx- so that the default template applies.
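For the geoip part, a copied template for nginx-* would keep something roughly like this (based on the default Logstash 5.x template, so double-check it against the template your version actually ships; trimmed to just the relevant piece):

curl -XPUT 'localhost:9200/_template/nginx' -H 'Content-Type: application/json' -d '
{
  "template": "nginx-*",
  "mappings": {
    "_default_": {
      "properties": {
        "geoip": {
          "dynamic": true,
          "properties": {
            "ip":        { "type": "ip" },
            "location":  { "type": "geo_point" },
            "latitude":  { "type": "half_float" },
            "longitude": { "type": "half_float" }
          }
        }
      }
    }
  }
}'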

That's the default anyway :slight_smile:
