I am currently upgrading our Elastic setup to 6.x and am a bit lost about the intent of some of the changes.
In the past I set up Logstash to accept Apache log files via Filebeat, as well as through a manual spooling process using the file input plugin. With the input plugin I had a configuration like this:
input {
  file {
    path => "/path/to/site_access.log"
    start_position => "beginning"
    # Tag events so the filters and the output below can select on the type
    add_field => { "[@metadata][type]" => "apache" }
  }
}
filter {
  if [@metadata][type] == "apache" {
    mutate {
      replace => {
        "host" => "indexer01"
      }
    }
  }
}
And then the apache filter:
filter {
  if [@metadata][type] == "apache" {
    grok {
      match => { "message" => "\A\[%{HTTPDATE:accept_date}\] %{IP:client_ip} %{NOTSPACE:tls_protocol} %{NOTSPACE:cipher} \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:http_version})?|%{DATA:raw_request})\" %{NUMBER:status_code:int} %{NOTSPACE:bytes:int} %{NUMBER:duration:int} %{QS:referrer} %{QS:agent}" }
    }
    date {
      match => [ "accept_date", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
    mutate {
      gsub => [
        "agent", "\"", "",
        "referrer", "\"", ""
      ]
    }
    ruby {
      code => "
        begin
          event.set('bytes', nil) if event.get('bytes') == '-'
          event.set('duration', nil) if event.get('duration') == '-'
          event.set('agent', nil) if event.get('agent') == '-'
          event.set('referrer', nil) if event.get('referrer') == '-'
        end
      "
    }
  }
}
Followed by elasticsearch output:
output {
  elasticsearch {
    hosts => ["http://storage01:9200"]
    index => "%{[@metadata][type]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
And on Elasticsearch I have a template:
{
  "index_patterns": ["apache-*"],
  "mappings": {
    "apache": {
      "properties": {
        "accept_date": {
          "type": "date",
          "format": "dd/MMM/yyyy:HH:mm:ss Z||strict_date_optional_time||epoch_millis"
        },
        "agent": {
          "type": "keyword"
        },
        "bytes": {
          "type": "long"
        },
        "cipher": {
          "type": "keyword"
        },
        "client_ip": {
          "type": "ip"
        },
        "duration": {
          "type": "long"
        },
        "host": {
          "type": "keyword"
        },
        "http_version": {
          "type": "keyword"
        },
        "referrer": {
          "type": "text"
        },
        "request": {
          "type": "text"
        },
        "status_code": {
          "type": "keyword"
        },
        "tls_protocol": {
          "type": "keyword"
        },
        "verb": {
          "type": "keyword"
        }
      }
    }
  }
}
This results in documents in Elasticsearch with _type set to apache.
Now, the moment I remove document_type (since it is deprecated), the elasticsearch output plugin sets _type to doc (as documented for document_type). This results in an error:
Rejecting mapping update to [apache-2018.02.08] as the final mapping would have more than 1 type: [apache, doc]
What is the intended way forward? I can work around it by naming the mapping in the apache template doc, but that seems a bit silly. I can also keep document_type around and set it to apache, which then matches the template's mapping name, but that would defeat the whole purpose of removing deprecated settings.