We are running into an issue where the Elasticsearch tokenizer splits hostnames at the '-'. The hostname convention where I work is
<app>-<location>-<id>
so instead of elasticsearch-ny-001 we end up with "elasticsearch", "ny", "001". I have searched everywhere for a fix, including this forum, where one proposed solution was to replace the - with a . but for reporting we need to keep the full hostname exactly as it is. I tried the following template, but it did not have the desired outcome:
curl -XPUT http://localhost:9200/_template/logstash -d '
{
  "template": "*logstash*",
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "file": {
      "properties": {
        "host": {
          "type": "string",
          "index": "not_analyzed"
        },
        "environment": {
          "type": "string",
          "index": "not_analyzed"
        },
        "path": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'
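A couple of checks that might help narrow this down. This is only a sketch: the ES_URL variable and the *logstash* index pattern are assumptions taken from the command above. As far as I understand, index templates are only applied when an index is created, so indices that existed before the PUT would keep their old mapping, and the properties above sit under the "file" type, so they would only apply to documents indexed with that type.

```shell
# Assumed node address, taken from the template command above.
ES_URL="http://localhost:9200"

# The stored template, to confirm the PUT was accepted as written.
TEMPLATE_URL="$ES_URL/_template/logstash"

# The live mapping of every index matching the template pattern.
# The index pattern is an assumption; adjust it to the real index names.
MAPPING_URL="$ES_URL/*logstash*/_mapping"

curl -s --max-time 5 -XGET "$TEMPLATE_URL?pretty" \
  || echo "could not reach $ES_URL"
curl -s --max-time 5 -XGET "$MAPPING_URL?pretty" \
  || echo "could not reach $ES_URL"
```

If the "host" field in the second response does not show "index" : "not_analyzed", the template has not taken effect for that index or type.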
I am aware that Logstash sets up the tokenizer by default, but we publish the data with fluentd rather than Logstash, and I can't find anywhere in the Logstash GitHub repo where the tokenizer is set.
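To see exactly which tokens are produced, the splitting can be reproduced with the _analyze API. A minimal sketch, assuming a node on localhost:9200 and a version that still accepts the analyzer and text query parameters (newer releases expect a JSON body instead); ES_URL and SAMPLE_HOST are placeholder names:

```shell
# Assumed node address and a sample hostname following our convention.
ES_URL="http://localhost:9200"
SAMPLE_HOST="elasticsearch-ny-001"

# Ask the standard analyzer to tokenize the sample hostname.
ANALYZE_URL="$ES_URL/_analyze?analyzer=standard&text=$SAMPLE_HOST"

curl -s --max-time 5 -XGET "$ANALYZE_URL" \
  || echo "could not reach $ES_URL"
```

With the standard analyzer this should list the same "elasticsearch", "ny", "001" tokens described above.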
Any help would be greatly appreciated.