I am using Filebeat and Logstash to ship nginx's JSON access logs on Kubernetes.
The nginx configuration looks like this:
nginx.conf
http {
    log_format bucket escape=json
        '{'
            '"request_id": "$request_id",'
            '"method": "$request_method",'
            '"status": "$status",'
            '"forwarded_for": "$http_x_forwarded_for",'
            '"host": "$host",'
            '"url": "$request_uri",'
            '"referer": "$http_referer",'
            '"remote_ip": "$remote_addr",'
            '"server_ip": "$server_addr",'
            '"user_agent": "$http_user_agent",'
        '}';
}
server {
    access_log /var/log/nginx/access.json bucket;
}
Filebeat's configuration:
filebeat.yml
filebeat.shutdown_timeout: 5s
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/nginx/access.json*
    exclude_files: ['\.gz$']
    tags: ["access"]
processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: true
      max_depth: 1
      target: ""
      overwrite_keys: true
      add_error_key: false
output.logstash:
  hosts: ["logstash.default.svc.cluster.local:5044"]
Here overwrite_keys is true, so it should overwrite the metadata, right?
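To make that expectation concrete, here is a minimal Python sketch of how I understand decode_json_fields to behave with target: "" and overwrite_keys: true. It is a simplified model, not Filebeat's actual implementation; the function name and the sample event are made up for illustration, and it assumes that a line which fails to parse passes through unchanged because add_error_key is false.

```python
import json


def decode_json_fields(event, field="message"):
    """Simplified model of decode_json_fields with target "" and
    overwrite_keys true (an assumption, not Filebeat's real code)."""
    merged = dict(event)
    try:
        decoded = json.loads(event.get(field, ""))
    except (TypeError, ValueError):
        # Assumed behaviour with add_error_key false: nothing is merged,
        # so Filebeat's own host metadata stays in place.
        return merged
    if isinstance(decoded, dict):
        # Decoded keys land at the event root and replace existing keys.
        merged.update(decoded)
    return merged


# Hypothetical event: nginx's "host" string collides with Filebeat's host object.
event = {
    "message": '{"host": "api.mysite.com", "status": "204"}',
    "host": {"name": "filebeat-adio3"},
}
decoded_event = decode_json_fields(event)
print(decoded_event["host"])
```

If this model is right, the nginx "host" string replaces Filebeat's host object only when the line parses as a JSON object; any other line keeps the original host metadata.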
Logstash's configuration:
logstash.conf
input {
  beats {
    port => 5044
  }
}
filter {
  if "access" in [tags] {
    mutate {
      add_field => { "[@metadata][tags]" => "%{tags}" }
      remove_field => [
        "agent",
        "event",
        "service",
        "log",
        "input",
        "fileset",
        "ecs",
        "container",
        "kubernetes",
        "@timestamp",
        "@version",
        "message",
        "tags"
      ]
    }
  }
}
output {
  if "access" in [@metadata][tags] {
    google_cloud_storage {
      bucket => "nginx_logs"
      json_key_file => "/secrets/service_account/credentials.json"
      temp_directory => "/tmp/nginx_logs"
      log_file_prefix => "logstash_nginx_logs"
      max_file_size_kbytes => 1024
      output_format => "json"
      date_pattern => "%Y-%m-%dT%H:00"
      flush_interval_secs => 2
      gzip => false
      gzip_content_encoding => false
      uploader_interval_secs => 60
      include_uuid => true
      include_hostname => true
    }
  }
}
It worked well at first. The log data was written to JSON files like this:
{"user_agent":"Mozilla/5.0 (iPhone; CPU iPhone OS 14_2_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 [FBAN/FBIOS;FBDV/iPhone13,1;FBMD/iPhone;FBSN/iOS;FBSV/14.2.1;FBSS/3;FBID/phone;FBLC/ja_JP;FBOP/5]","forwarded_for":"1.2.3.4","host":"api.mysite.com","method":"OPTIONS","request_id":"0127054b954fe4973852e1886130a6ca","referer":"https://www.world.com/","remote_ip":"2.3.4.5","server_ip":"3.4.5.6","status":"204","url":"/api/v1/post"}
But recently, records like these started to appear:
{"user_agent":"Mozilla/5.0 (iPhone; CPU iPhone OS 14_2_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 [FBAN/FBIOS;FBDV/iPhone13,1;FBMD/iPhone;FBSN/iOS;FBSV/14.2.1;FBSS/3;FBID/phone;FBLC/ja_JP;FBOP/5]","forwarded_for":"1.2.3.4","host":"api.mysite.com","method":"OPTIONS","request_id":"0127054b954fe4973852e1886130a6ca","referer":"https://www.world.com/","remote_ip":"2.3.4.5","server_ip":"3.4.5.6","status":"204","url":"/api/v1/post"}
{"host":{"name":"filebeat-adio3"}}
{"host":{"name":"filebeat-adio3"}}
{"host":{"name":"filebeat-adio3"}}
These extra records are not regular log data. It looks like the host metadata of the Filebeat server itself has been sent. But why? Is the mistake on the Filebeat side or the Logstash side?
Is there a better way to handle this host data, so that the nginx fields are sent without conflicting with Filebeat's or Logstash's metadata?
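One possible explanation I can reproduce on paper: if a line is not decoded as JSON in Filebeat, the event reaches Logstash with only the raw message plus Beats metadata, and the remove_field list then strips everything except host. The Python sketch below only simulates that remove_field step; the sample event and the helper names (strip_fields, looks_like_access_log) are hypothetical, and I have not confirmed that this is what actually happens in my pipeline.

```python
# Fields removed by the mutate filter in logstash.conf.
REMOVED_FIELDS = {
    "agent", "event", "service", "log", "input", "fileset", "ecs",
    "container", "kubernetes", "@timestamp", "@version", "message", "tags",
}


def strip_fields(event):
    """Mimic mutate's remove_field: drop the listed top-level fields."""
    return {k: v for k, v in event.items() if k not in REMOVED_FIELDS}


def looks_like_access_log(event):
    """Hypothetical guard: a decoded nginx record always has request_id."""
    return "request_id" in event


# Assumed shape of an event whose message was never decoded.
undecoded = {
    "@timestamp": "2021-01-01T00:00:00.000Z",
    "@version": "1",
    "message": "not a JSON line",
    "tags": ["access"],
    "agent": {"type": "filebeat"},
    "host": {"name": "filebeat-adio3"},
}

leftover = strip_fields(undecoded)
print(leftover)  # {'host': {'name': 'filebeat-adio3'}}
print(looks_like_access_log(leftover))  # False
```

If that is the cause, a guard like the one above could be expressed in Logstash as a conditional such as `if ![request_id] { drop { } }` placed before the output, but I would like to know whether that is the right approach.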