Hostname Tokenizer


(Telmo X) #1

We are running into an issue where the Elasticsearch tokenizer splits hostnames at the '-' character. The hostname convention where I work is

<app>-<location>-id 

so instead of elasticsearch-ny-001 we end up with "elasticsearch", "ny", "001". I have searched everywhere for a solution, including this forum, where replacing the - with a . was proposed, but for reporting we need to keep the full hostname as it is. I have tried the following template, but it did not have the desired outcome.

curl -XPUT http://localhost:9200/_template/logstash -d '
{
    "template": "*logstash*",
    "settings" : {
            "number_of_shards" : 1
    },
    "mappings" :  {
            "file" : {
                    "properties" : {
                            "host" : {
                                    "type" : "string",
                                    "index" : "not_analyzed"
                            },
                            "environment" : {
                                    "type" : "string",
                                    "index" : "not_analyzed"
                            },
                            "path" : {
                                    "type" : "string",
                                    "index" : "not_analyzed"
                            }
                    }
            }
    }
}'
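
To make the intent of "not_analyzed" in the template above concrete, here is a rough Python sketch of the difference between an analyzed and a not_analyzed string field. It is only an approximation for illustration: the regular expression stands in for the real analyzer (which follows Unicode word-boundary rules), and both function names are hypothetical, not part of any Elasticsearch API.

```python
import re
from typing import List


def analyzed_tokens(value: str) -> List[str]:
    """Approximate an analyzed string field (assumption: lowercase, then
    split on every character that is not an ASCII letter or digit)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]


def not_analyzed_tokens(value: str) -> List[str]:
    """A not_analyzed field is indexed as one exact, unmodified term."""
    return [value]


hostname = "elasticsearch-ny-001"
print(analyzed_tokens(hostname))      # ['elasticsearch', 'ny', '001']
print(not_analyzed_tokens(hostname))  # ['elasticsearch-ny-001']
```

If the mapping is applied to the index, a term filter or aggregation on host should see the second form, that is, the full hostname as a single value.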

I am aware that Logstash sets the tokenizer by default, but we publish the data with fluentd rather than Logstash, and I can't find anywhere in the Logstash GitHub repo where the tokenizer is set.

Any help would be greatly appreciated.


(Mark Walkom) #2

Is fluentd using the logstash index pattern?
Because if so, that should definitely work.
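
A quick way to reason about this: the template only applies to indices whose names match its "template" pattern, "*logstash*". The Python sketch below uses fnmatch as a stand-in for that wildcard matching (an approximation, since fnmatch also understands ? and [] which may not behave the same way in Elasticsearch). The index names are hypothetical examples, not taken from the original setup.

```python
from fnmatch import fnmatchcase

# Pattern copied from the "template" field of the index template above.
TEMPLATE_PATTERN = "*logstash*"


def template_applies(index_name: str, pattern: str = TEMPLATE_PATTERN) -> bool:
    """Return True if the index name matches the template's wildcard pattern."""
    return fnmatchcase(index_name, pattern)


# Hypothetical index names: one in Logstash style, one with a fluentd prefix.
for name in ("logstash-2015.06.01", "fluentd-2015.06.01"):
    print(name, template_applies(name))
```

If fluentd writes to an index name that does not contain "logstash", the template would not be applied and the host field would fall back to the default analyzed mapping.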

