After setting up my logging system, I started shipping Apache access logs from 3 servers to my ELK server. On the Kibana Discover page, some log records have parsed values for the response, request, OS, verb, and agent fields, but records from one of our servers (Server A) show no values for these fields. Checking the message field, I found that we have two different Apache access log formats. Server A probably has a load balancer in front of it, which would explain the second format.
Apache access logs from Server A show no values in some fields.
Message samples from Server A and the other servers:
Normal Apache access log format:
168.125.158.50 - - [19/Mar/2018:19:58:25 +0000] "GET /favicon.ico HTTP/1.1" 404 506 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/30900801 Firefox/56.0"
Apache access message from Server A:
90.187.295.31 - - [20/Mar/2018:07:41:51 +0000] rt=189014 st=+ "GET /access.jsp?os_destination=%2Fbrowse%8FKJP-15006%8Fpage%3Dcom.servercode.servera-suite-utilities%253Atransitions-summary-tabpanel HTTP/1.1" 200 6220 "-" "Mozilla/5.0 (compatible; miladbot/1.2~bl; +http://www.milad.com/bot.html)\ "-" "64.200.171.62, 90.187.295.31"
This is the filter in my Logstash pipeline (beats.conf):
filter {
  if "ApacheAccessLogs" in [tags] {
    mutate { replace => { type => "apache-access" } }
    # Try the combined format first, then fall back to the common format
    grok {
      match => [
        "message" , "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}",
        "message" , "%{COMMONAPACHELOG}+%{GREEDYDATA:extra_fields}"
      ]
      overwrite => [ "message" ]
    }
    mutate {
      convert => {
        "response" => "integer"
        "bytes" => "integer"
        "responsetime" => "float"
      }
    }
    geoip {
      source => "clientip"
      target => "geoip"
      add_tag => [ "apache-geoip" ]
    }
    # Use the timestamp from the log line as the event time
    date {
      match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
      remove_field => [ "timestamp" ]
    }
    useragent {
      source => "agent"
    }
  }
}
How can I unify all the Apache access log formats, or implement a specific filter for the Apache access logs from Server A? Any suggestions or help with a solution would be appreciated.
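One direction I am considering, as an untested sketch: add a Server A specific pattern ahead of the stock ones in the same grok block. It is assembled from stock grok patterns and follows the layout of the legacy COMMONAPACHELOG pattern, with the rt= and st= tokens added after the timestamp. The field name conn_status is a placeholder of mine, and mapping rt to responsetime is an assumption, since I do not know for sure what rt and st stand for in Server A's LogFormat.

```
grok {
  match => {
    "message" => [
      # Server A: extra "rt=... st=..." tokens between the timestamp and the request
      # (conn_status is a hypothetical field name; rt -> responsetime is an assumption)
      '%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] rt=%{NUMBER:responsetime} st=%{NOTSPACE:conn_status} "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}%{GREEDYDATA:extra_fields}',
      # Other servers: stock patterns, unchanged from my current filter
      "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}",
      "%{COMMONAPACHELOG}+%{GREEDYDATA:extra_fields}"
    ]
  }
}
```

As far as I understand, grok tries the patterns in order and stops at the first one that matches, so lines from the other servers would fall through to the stock patterns. Whatever follows the agent string in Server A's lines (the extra "-" and the list of forwarded IPs) would end up in extra_fields.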