I'm trying to use Filebeat to ship my AWS CloudFront logs, but I can't figure out how to get the @timestamp field populated correctly. The log format is TSV (tab-separated values), and the first two fields are the date and the time, like this:
2020-05-16 14:47:38 IAD89-C3 490 192.168.0.1 ...
I'm using dissect to separate the fields and this works fine:
- dissect:
    tokenizer: "%{timestamp.date} %{timestamp.time} %{edge_location} %{response_bytes} %{clientip} %{method} %{distribution} %{uri_path} %{response} %{referer} %{user_agent} %{query} %{cookie} %{result_type} %{request_id} %{host_header} %{protocol} %{request_bytes} %{duration} %{x_forwarded_for} %{ssl_protocol} %{ssl_cipher} %{edge_response_result_type} %{cs_protocol_version} %{fle_status} %{fle_encrypted_fields} %{c_port} %{time_to_first_byte} %{x_edge_detail_result_type} %{sc_content_type} %{sc_content_len} %{sc_range_start} %{sc_range_end}"
    field: "message"
    target_prefix: "aws.cloudfront"
The problem is that @timestamp is not populated from the date and time in the log line. I can't use the timestamp processor because it requires the entire timestamp to be in a single field, and I can't figure out how to concatenate fields in Filebeat.
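To make the goal concrete, this is the transformation I'm after, sketched in Python rather than Filebeat config. It is not a Filebeat solution, only an illustration of the intended result: the function name is made up for this example, the sample line is shortened to its first five fields, and I'm assuming the date and time are in UTC.

```python
from datetime import datetime, timezone


def combine_cloudfront_timestamp(date_field: str, time_field: str) -> datetime:
    """Join the separate date and time columns and parse them as one timestamp.

    Assumption: the log's date and time columns are in UTC.
    """
    combined = f"{date_field} {time_field}"
    parsed = datetime.strptime(combined, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


# Shortened sample line: only the first five tab-separated fields.
line = "2020-05-16\t14:47:38\tIAD89-C3\t490\t192.168.0.1"
date_field, time_field, *rest = line.split("\t")

ts = combine_cloudfront_timestamp(date_field, time_field)
print(ts.isoformat())  # prints 2020-05-16T14:47:38+00:00
```

In Filebeat terms, the equivalent would be getting the two dissected values into a single field that the timestamp processor can parse.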
I understand it's possible to change the output format used by CloudFront, but I would rather not do that: these logs are also part of a data lake, so their format should stay consistent between past and future data.