How do I get proper mapping for haproxy grok


(Sam Saffron) #1

I have a rather simple logstash setup.

Our haproxy server ships logs via lumberjack to the logstash/kibana/elastic server.

The configuration is pretty trivial:

input {
  lumberjack {
    port => 5150
    type => "logs"
    ssl_certificate => "/etc/logstash/tls/logstash-forwarder.crt"
    ssl_key => "/etc/logstash/tls/logstash-forwarder.key"
  }
}
filter {
    grok {
      match => ["message", "%{HAPROXYHTTP}"]
    }
}
output {
  elasticsearch { host => localhost }
}

All of this seems pretty straightforward, but in Kibana all these fields are being mapped as strings. What am I doing wrong? How can I get automatic mapping working correctly for all my integers (and IPs)? Do I need to add mutate filters? Do I need to post mappings?

This is what was automatically detected:

You can see ports / hours / times are all being detected as strings.

(I have no issue with erasing all my current data.)


(Mark Walkom) #2

You really need to create a mapping for this to be what you expect. ES does its best, but it plays really safe and maps things to string accordingly.


(Sam Saffron) #3

I've been struggling with this for about 3 hours now and cannot find a good step-by-step guide on how to create mappings. I tried the script below, but I just get 400s after running it:

#!/bin/sh
curl -XPUT http://localhost:9200/_template/logstash_per_index -d '
{
  "template" : "logstash-*",
  "settings" : {
    "index.refresh_interval" : "5s"
  },
  "mappings" : {
    "_default_" : {
       "_all" : {"enabled" : true, "omit_norms" : true},
       "dynamic_templates" : [ {
         "message_field" : {
           "match" : "message",
           "match_mapping_type" : "string",
           "mapping" : {
             "type" : "string", "index" : "analyzed", "omit_norms" : true
           }
         }
       }, {
         "string_fields" : {
           "match" : "*",
           "match_mapping_type" : "string",
           "mapping" : {
             "type" : "string", "index" : "analyzed", "omit_norms" : true,
               "fields" : {
                 "raw" : {"type": "string", "index" : "not_analyzed", "doc_values" : true, "ignore_above" : 256}
               }
           }
         }
       }, {
         "float_fields" : {
           "match" : "*",
           "match_mapping_type" : "float",
           "mapping" : { "type" : "float", "doc_values" : true }
         }
       }, {
         "double_fields" : {
           "match" : "*",
           "match_mapping_type" : "double",
           "mapping" : { "type" : "double", "doc_values" : true }
         }
       }, {
         "byte_fields" : {
           "match" : "*",
           "match_mapping_type" : "byte",
           "mapping" : { "type" : "byte", "doc_values" : true }
         }
       }, {
         "short_fields" : {
           "match" : "*",
           "match_mapping_type" : "short",
           "mapping" : { "type" : "short", "doc_values" : true }
         }
       }, {
         "integer_fields" : {
           "match" : "*",
           "match_mapping_type" : "integer",
           "mapping" : { "type" : "integer", "doc_values" : true }
         }
       }, {
         "long_fields" : {
           "match" : "*",
           "match_mapping_type" : "long",
           "mapping" : { "type" : "long", "doc_values" : true }
         }
       }, {
         "date_fields" : {
           "match" : "*",
           "match_mapping_type" : "date",
           "mapping" : { "type" : "date", "doc_values" : true }
         }
       } ],
       "properties" : {
         "@timestamp": { "type": "date", "doc_values" : true },
         "@version": { "type": "string", "index": "not_analyzed", "doc_values" : true },
         "client_ip": { "type": "ip", "doc_values" : true },
         "client_port": {"type": "integer"},
         "geoip"  : {
           "type" : "object",
           "dynamic": true,
           "properties" : {
             "ip": { "type": "ip", "doc_values" : true },
             "location" : { "type" : "geo_point", "doc_values" : true },
             "latitude" : { "type" : "float", "doc_values" : true },
             "longitude" : { "type" : "float", "doc_values" : true }
           }
         }
       }, 
       "haproxy-access" : {
          "time_backend_connect": {"type": "integer"},
          "time_backend_response": {"type": "integer"},
          "time_duration": {"type": "integer"},
          "bytes_read": {"type": "integer"},
          "client_ip": {"type": "ip"},
          "client_port": {"type": "integer"},
          "http_status_code": {"type": "integer"}
       }
    }  
  }
}'

What is the minimal document I need to post to Elasticsearch to get it to map just one field?


(Mark Walkom) #4

That looks OK from what I can tell. Can you provide more info on the error you are getting?
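
On the minimal-template question: a sketch of a template that maps a single field might look like the following. It assumes Elasticsearch 1.x template syntax and indices named logstash-*. The template name haproxy_minimal and the ES_URL variable are made up for this example, and "order": 1 is there on the assumption that Logstash's own default template is also installed and should be layered under this one.

```shell
#!/bin/sh
# Minimal index template: maps one field (client_port) as an integer.
# Assumptions: Elasticsearch 1.x template syntax, index names matching
# logstash-*, and a hypothetical template name "haproxy_minimal".

minimal_template() {
  cat <<'EOF'
{
  "template": "logstash-*",
  "order": 1,
  "mappings": {
    "_default_": {
      "properties": {
        "client_port": { "type": "integer" }
      }
    }
  }
}
EOF
}

# Dry run by default: print the JSON for review.
# Set ES_URL (for example http://localhost:9200) to send it to the cluster.
if [ -n "${ES_URL:-}" ]; then
  minimal_template | curl -XPUT "$ES_URL/_template/haproxy_minimal" -d @-
else
  minimal_template
fi
```

A template only affects indices created after it is stored, so the current day's logstash-* index has to be deleted (or you wait for the next daily index) before the new type shows up in Kibana.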


(Sam Saffron) #5

I appear to be getting this error when the index is created:

[2015-09-01 06:37:24,665][DEBUG][action.admin.indices.create] [Death-Stalker] [logstash-2015.09.01] failed to create
org.elasticsearch.index.mapper.MapperParsingException: mapping [_default_]
	at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:382)
	at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:374)
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:196)
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:162)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.mapper.MapperParsingException: Root type mapping not empty after parsing! Remaining fields:   [haproxy-access : {client_port={type=integer}, @timestamp={type=date, doc_values=true}, time_backend_response={type=integer}, http_status_code={type=integer}, time_duration={type=integer}, bytes_read={type=integer}, client_ip={type=ip}, time_backend_connect={type=integer}}]
	at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:278)
	at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:192)
	at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:177)
	at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:294)
	at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:379)
	... 6 more

(Mark Walkom) #6

Try this - https://gist.github.com/markwalkom/78d56b5a5684d765642a
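
For anyone who cannot open the gist: the "Remaining fields" part of the error names haproxy-access, which in the posted template sits inside the _default_ mapping and lists its fields without a "properties" wrapper. One way to restructure it (a sketch, not necessarily what the gist contains) is to make haproxy-access a sibling of _default_ under "mappings". This assumes Elasticsearch 1.x and that events are indexed with the type haproxy-access. The template name logstash_haproxy and the ES_URL variable are invented for this example.

```shell
#!/bin/sh
# Sketch: haproxy-access as its own mapping type, a sibling of _default_,
# with its fields wrapped in "properties".
# Assumptions: Elasticsearch 1.x; events carry the type "haproxy-access";
# the template name "logstash_haproxy" is hypothetical.

haproxy_template() {
  cat <<'EOF'
{
  "template": "logstash-*",
  "order": 1,
  "mappings": {
    "_default_": {
      "properties": {
        "@timestamp": { "type": "date", "doc_values": true }
      }
    },
    "haproxy-access": {
      "properties": {
        "time_backend_connect": { "type": "integer" },
        "time_backend_response": { "type": "integer" },
        "time_duration": { "type": "integer" },
        "bytes_read": { "type": "integer" },
        "client_ip": { "type": "ip" },
        "client_port": { "type": "integer" },
        "http_status_code": { "type": "integer" }
      }
    }
  }
}
EOF
}

# Dry run by default: print the JSON for review.
# Set ES_URL (for example http://localhost:9200) to send it to the cluster.
if [ -n "${ES_URL:-}" ]; then
  haproxy_template | curl -XPUT "$ES_URL/_template/logstash_haproxy" -d @-
else
  haproxy_template
fi
```

Note that the lumberjack input in the first post sets type => "logs". If the events really arrive with that type, the haproxy-access mapping would never be used, and moving the same fields into the "properties" of _default_ would cover every type instead.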


(Sam Saffron) #7

Thanks, I got something similar to this working!


(Magnus Bäck) #8

If only the HAPROXYHTTP grok pattern were configured to emit integer fields, you wouldn't have to do this. I can't think of a single reason why its %{INT:foo} tokens aren't %{INT:foo:int} instead. I've filed github.com/logstash-plugins/logstash-patterns-core issue #83 to get this corrected.
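
Until the pattern changes, a mutate filter placed after the grok filter can do the cast on the Logstash side. The sketch below just prints such a filter block. The field names are the ones mentioned earlier in this thread, and the file name in the usage line is hypothetical.

```shell
#!/bin/sh
# Prints a Logstash filter block that casts some HAPROXYHTTP fields to
# integers. It must be loaded after the grok filter that creates the fields.
# Assumption: the field names are the ones quoted in this thread.

haproxy_convert_filter() {
  cat <<'EOF'
filter {
  mutate {
    convert => {
      "client_port" => "integer"
      "time_backend_connect" => "integer"
      "time_backend_response" => "integer"
      "time_duration" => "integer"
      "bytes_read" => "integer"
      "http_status_code" => "integer"
    }
  }
}
EOF
}

haproxy_convert_filter
```

Usage might look like: sh print-filter.sh > /etc/logstash/conf.d/20-haproxy-types.conf. With numeric values in the event, dynamic mapping on a newly created index should pick a numeric type on its own. An index that has already mapped these fields as strings keeps that mapping, and mutate has no conversion to an ip type, so client_ip still needs an explicit mapping.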

