Below are two log lines from the same CloudFront distribution. Both follow the same pattern, but when I process them with Logstash, the same filter tags the first line with _grokparsefailure while the second line parses without any failure. Please help me get rid of this _grokparsefailure:
2020-07-09 19:07:27 DXB50-C1 14487 37.39.145.41 GET test.cloudfront.net /article/news-detail/335752 200 https://m.example.com/ Mozilla/5.0%20(iPhone;%20CPU%20iPhone%20OS%2013_3_1%20like%20Mac%20OS%20X)%20AppleWebKit/605.1.15%20(KHTML,%20like%20Gecko)%20Version/13.0.5%20Mobile/15E148%20Safari/604.1 - - Miss J44p0LRvtli5kAZbDFQxG5hE6oLAweqS8NU330Z4P6rU_3VPp9nyfg== m.example.com https 656 0.180 - TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Miss HTTP/2.0 - - 653640.180 Miss text/html;%20charset=UTF-8 - - -
2020-07-09 17:52:09 ARN1-C1 4651 178.130.87.85 GET test.cloudfront.net /partials/category/310929/kottayam 200 https://m.example.com/category/310929/kottayam Mozilla/5.0%20(Linux;%20Android%205.1.1;%20SM-J320P)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/83.0.4103.106%20Mobile%20Safari/537.36 - - Miss xSU39G7SA_R-EZg9vvCzAToW5--Y45dg8Fw49W-J51_1mb6XPkK6MA== m.example.com https 93 0.994 - TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Miss HTTP/2.0 - - 120170.994 Miss text/html;%20charset=UTF-8 - - -
My Logstash config is as follows:
input {
  s3 {
    access_key_id => "xxxxxxxxxxxxxxxxxx"
    secret_access_key => "xxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    region => "ap-south-1"
    bucket => "logs.test.com"
    type => "example"
    codec => "plain"
    prefix => "cf-example/"
    sincedb_path => "/ebs/sincedb/.cf-example"
  }
}
filter {
  grok {
    match => { "message" => "%{DATE_EU:date}\t%{TIME:time}\t(?<x_edge_location>\b[\w\-]+\b)\t(?:%{NUMBER:sc_bytes:int}|-)\t%{IPORHOST:c_ip}\t%{WORD:cs_method}\t%{HOSTNAME:cs_host}\t%{NOTSPACE:request_uri}\t%{NUMBER:elb_status:int}\t%{GREEDYDATA:referrer}\t%{GREEDYDATA:User_Agent}\t%{GREEDYDATA:http_uri}\t%{GREEDYDATA:cookies}\t%{WORD:x_edge_result_type}\t%{NOTSPACE:x_edge_request_id}\t%{HOSTNAME:x_host_header}\t%{URIPROTO:cs_protocol}\t%{INT:cs_bytes:int}\t%{GREEDYDATA:time_taken}\t%{GREEDYDATA:x_forwarded_for}\t%{GREEDYDATA:ssl_protocol}\t%{GREEDYDATA:ssl_cipher}\t%{GREEDYDATA:x_edge_response_result_type}" }
  }
  grok {
    match => [ "request_uri", "/%{WORD:uri_param_1}/%{WORD:uri_param_2}/%{GREEDYDATA:other_params}?" ]
  }
  mutate {
    add_field => [ "listener_timestamp", "%{date} %{time}" ]
  }
  date {
    match => [ "listener_timestamp", "yy-MM-dd HH:mm:ss" ]
    target => "@timestamp"
  }
  geoip {
    source => "c_ip"
  }
  useragent {
    source => "User_Agent"
    target => "useragent"
  }
  mutate {
    remove_field => ["date", "time", "listener_timestamp", "cloudfront_version", "message", "cloudfront_fields", "User_Agent"]
  }
}
output {
  elasticsearch {
    index => "example-cf-logs-%{+YYYY.MM.dd}"
    hosts => "127.0.0.1:9200"
    template => "/ebs/tar/cf-example/cloudfront.template.json"
  }
}
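For reference, the second grok (the one on request_uri) can be exercised outside Logstash by expanding its stock pattern definitions: in the shipped grok-patterns file, WORD is \b\w+\b and GREEDYDATA is .*, and grok matching is unanchored. The snippet below is only a sketch of that expanded regex run against the request_uri value extracted from each of the two log lines above:

```python
import re

# Expanded form of the second grok pattern
# "/%{WORD:uri_param_1}/%{WORD:uri_param_2}/%{GREEDYDATA:other_params}?"
# using grok's stock definitions: WORD => \b\w+\b, GREEDYDATA => .*
# (re.search mirrors grok's unanchored matching).
pattern = re.compile(r"/\b\w+\b/\b\w+\b/.*")

# request_uri from the first (failing) log line: the segment
# "news-detail" contains a hyphen, which \b\w+\b cannot span.
print(bool(pattern.search("/article/news-detail/335752")))         # False

# request_uri from the second (passing) log line: every path
# segment is plain word characters, so the pattern matches.
print(bool(pattern.search("/partials/category/310929/kottayam")))  # True
```

So the mismatch is reproducible with the request URIs alone, independent of the rest of the filter chain.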
How can I fix this?