CloudFront log arrived as giant blob, how to make it searchable in Kibana?

jchiu · September 18, 2020, 7:42pm

Hello All,

I am relatively new to the ELK world and am having trouble making the information in CloudFront logs searchable in Kibana.

Let me explain in detail:

I was able to have CF (CloudFront) logs delivered to a Logstash instance and view them in Kibana.
The problem is that the log data arrives as one giant "message" blob (see below):

{"x_edge_result_type":"Miss","x_forwarded_for":"-","sc_content_len":"42109","@version":"1","time_to_first_byte":"1.039","x_edge_detailed_result_type":"Miss","cookies":"-","sc_range_end":"-","useragent":{"os":"Debian","name":"Other","build":"","os_name":"Debian","device":"Other"},"sc_bytes":43130,"@timestamp":"2020-09-12T23:05:11.000Z","c_ip":"111.222.333.444","cs_protocol_version":"HTTP/1.1","monthday":"12","type":"access-cf-repos","cs_host":"d111111111.cloudfront.net","fle_status":"-","geoip":{"region_code":"OR","postal_code":"97818","latitude":45,"continent_code":"NA","longitude":-119,"location":{"lat":45,"lon":-119},"country_code3":"US","dma_code":810,"ip":"111.222.333.444","city_name":"Boardman","country_code2":"US","region_name":"Oregon","timezone":"America/Los_Angeles","country_name":"United States"},"referrer":"-","month":"09","x_edge_request_id":"CgmMEnPPv0NNcIEijDOWVBcLJmvp","cs_protocol":"https","fle_encrypted_fields":"-","time_taken":1.04,"c_port":26985,"x_edge_response_result_type":"Miss","cs_uri_query":"-","year":"2020","sc_content_type":"application/octet-stream","cs_bytes":1265,"x_host_header":"xxx.xxx.net","ssl_protocol":"TLSv1.2","sc_range_start":"-","cs_method":"GET","sc_status":200,"x_edge_location":"HIO51-C1","ssl_cipher":"ECDHE-SHA256","cs_uri_stem":"/dists/xenial/InRelease"}

This is how it looks in Kibana:

This is not ideal, since I cannot filter on the individual fields that matter to me.
For example, if I want to search for messages that match the following criteria:

"type":"access-cf-repos"
"c_ip":"111.222.333.444"
"sc_status":200

I have to use the search bar with Lucene query syntax (which is ugly), because those fields are not listed under "Available Fields".
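
For illustration, here is roughly what the three criteria would look like as a fielded Lucene query once `type`, `c_ip`, and `sc_status` exist as separate fields in the index. This is only a sketch: it assumes the field names match the keys in the blob above, and how the values match depends on the index mapping.

```
type:"access-cf-repos" AND c_ip:"111.222.333.444" AND sc_status:200
```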

If I understand correctly, this is related to how the Logstash config file is structured.
Here is a snippet of my Logstash config file:

    input {
      s3 {
        "bucket" => "dist-log"
        "prefix" => "distribution-log/"
        "type" => "access-cf-repos"
        "region" => "us-west-2"
        "interval" => "60"
        "delete" => "true"
      }
    }

    filter {
      grok {
        match => { "message" => "(?<date>%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:monthday})\t%{TIME:time}\t(?<x_edge_location>[\w\-]+)\t(?:%{NUMBER:sc_bytes:int}|-)\t%{IPORHOST:c_ip}\t%{WORD:cs_method}\t%{HOSTNAME:cs_host}\t%{NOTSPACE:cs_uri_stem}\t%{NUMBER:sc_status:int}\t%{NOTSPACE:referrer}\t%{NOTSPACE:User_Agent}\t%{NOTSPACE:cs_uri_query}\t%{NOTSPACE:cookies}\t%{WORD:x_edge_result_type}\t%{NOTSPACE:x_edge_request_id}\t%{HOSTNAME:x_host_header}\t%{WORD:cs_protocol}\t%{NUMBER:cs_bytes:int}\t%{NUMBER:time_taken:float}\t%{NOTSPACE:x_forwarded_for}\t%{NOTSPACE:ssl_protocol}\t%{NOTSPACE:ssl_cipher}\t%{WORD:x_edge_response_result_type}\t%{NOTSPACE:cs_protocol_version}\t%{NOTSPACE:fle_status}\t%{NOTSPACE:fle_encrypted_fields}\t%{NUMBER:c_port:int}\t%{NOTSPACE:time_to_first_byte}\t%{WORD:x_edge_detailed_result_type}\t%{NOTSPACE:sc_content_type}\t%{NOTSPACE:sc_content_len}\t%{NOTSPACE:sc_range_start}\t%{NOTSPACE:sc_range_end}" }
      }

      mutate {
        add_field => [ "listener_timestamp", "%{date} %{time}" ]
      }

      date {
        match => [ "listener_timestamp", "yyyy-MM-dd HH:mm:ss" ]
        target => "@timestamp"
      }

      geoip {
        source => "c_ip"
      }

      useragent {
        source => "User_Agent"
        target => "useragent"
      }

      mutate {
        remove_field => ["date", "time", "listener_timestamp", "cloudfront_version", "message", "cloudfront_fields", "User_Agent"]
      }
    }

Is there a better way to break down all the fields inside that giant message blob?
Are there plugins I can use, or examples I can follow?
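
One plugin that targets this kind of situation is the `json` filter. The fragment below is a minimal sketch, not a tested fix for this pipeline: it assumes the event really carries the JSON shown above as a string in the `message` field, and that dropping `message` after a successful parse is acceptable.

```
filter {
  json {
    # Parse the JSON string held in "message" into top-level event fields
    source => "message"
    # Applied only when parsing succeeds, so unparsable events keep their message
    remove_field => ["message"]
  }
}
```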

Thank you in advance!

jchiu · September 23, 2020, 12:15am

It turned out to be related to the bad mutate syntax I had in the config file...
I should have noticed that before posting the question here.
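
The reply does not say which part of the mutate syntax was at fault. As a sketch only, and assuming the array form of `add_field` was the culprit, the documented hash form of that option looks like this:

```
filter {
  mutate {
    # Hash form: field name => value (values may reference other fields)
    add_field => { "listener_timestamp" => "%{date} %{time}" }
  }
}
```

Running Logstash with `--config.test_and_exit` checks a config file for syntax errors before the pipeline is started.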

system · October 21, 2020, 12:15am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
