Filebeat Version: 8.16.2
We currently have Filebeat configured to read logs from a custom location and enrich them with Kubernetes metadata.
Repro:
- Create log files that will be shipped
- Start Filebeat (after the log files already exist)
- Filebeat starts harvesting the files and, in parallel, fetches the Kubernetes metadata and builds the index for the pod
- All log lines shipped before the API response is received end up without metadata
In our ELK setup we rely on the Kubernetes metadata to route events to the appropriate pipeline, so because of the issue above we lose a few log lines.
Is there a way to delay sending an event until its metadata is available (not to delay harvesting, as that might result in missed logs if the log file is wiped), and to send it as-is if the metadata never arrives?
This will always happen when a new pod is created and Filebeat has not yet cached its metadata: because of the race condition, the events are sent without enrichment.
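If there is no built-in way to wait, a fallback we are considering is to tag events that were not enriched, so Logstash can route them to a default pipeline. A minimal sketch against our config below; the tag name is our own, and it assumes enriched events always carry `kubernetes.pod.name` (the `add_tags` processor would go after `add_kubernetes_metadata` in the processors list):

```yaml
processors:
  # ... add_kubernetes_metadata as configured below ...
  - add_tags:                            # Tag events the enrichment missed
      tags: ["k8s_metadata_missing"]
      when:
        not:
          has_fields: ["kubernetes.pod.name"]
```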
filebeat.yml
```yaml
filebeat.inputs:
  - type: filestream
    enabled: true
    id: logs
    paths:
      - /var/log/app_logs/*/*.json       # Custom log path
    parsers:
      - ndjson:                          # Parse JSON logs
          target: ""                     # Merge JSON fields into the root
          add_error_key: true            # Add an error field if parsing fails
    prospector.scanner.symlinks: true    # Follow symlinks to log files

processors:
  - add_kubernetes_metadata:
      host: ${NODE_NAME}                 # Use the Kubernetes node name
      default_indexers.enabled: false    # Disable default indexers
      default_matchers.enabled: false    # Disable default matchers
      indexers:
        - ip_port: {}                    # Use the IP:port indexer
      matchers:
        - fields:
            # Match logs based on the server IP field
            lookup_fields: ["serverIP"]

# Output to Logstash
output.logstash:
  hosts: ["logstash:5050"]

logging.level: debug                     # Enable debug logging for troubleshooting
logging.selectors: ["add_kubernetes_metadata", "kubernetes"]
```
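For reference, the `fields` matcher resolves the pod by looking up the event's `serverIP` value in the `ip_port` index (which, as far as we understand, keys pods by `IP` and `IP:port`). A hypothetical log line, with all values made up, that the `ndjson` parser would merge into the event root:

```json
{"time": "2025-01-06T10:15:00Z", "level": "info", "serverIP": "10.42.0.17", "message": "request handled"}
```

For the match to succeed, `serverIP` must equal the pod's IP (or `IP:port`) as known to the indexer.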