Hai, I'm new to Elastic Stack and I'm an intern to a company. One of the managers give me this project and I need you guys help. The use-case here is they want to categorize and analyse old logs from a variety of devices, OS and firewalls like apache, windows event logs, iptables and so on. So, I want to parse apache logs as they are one of the easiest to get by. I experiment a lot but with little progress (yes, I do experiment with filebeat modules but for me, it doesn't quite fit with the use-case given). I'm not that experience with all of this and totally out of my job scope. I'm almost out of my wits here and these guys never touch these so I relying 100% on the Internet to power through these problem. I didn't see any problem with my grok pattern and i tested it out with grok debugger.
p/s: Due to my frustration and rage, I delete all the index and out inside Kibana so i don't have the output but essentially any of the pattern doesn't show and the message didn't get catergorize
Filebeat.yml
filebeat.inputs:
- type: log
enabled: false
paths:
- "/home/elk-kpmg/Desktop/logs/apache2/access/*"
tags: ["apache","apache_access"]
- type: log
enabled: false
paths:
- "/home/elk-kpmg/Desktop/logs/apache2/error/*"
tags: ["apache","apache_error"]
#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
# Array of hosts to connect to.
#hosts: ["localhost:9200"]
# Optional protocol and basic auth credentials.
#protocol: "https"
#username: "elastic"
#password: "changeme"
#----------------------------- Logstash output --------------------------------
output.logstash:
# The Logstash hosts
hosts: ["localhost:5044"]
# Optional SSL. By default is off.
# List of root certificates for HTTPS server verifications
#ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
# Certificate for SSL client authentication
#ssl.certificate: "/etc/pki/client/cert.pem"
# Client Certificate Key
#ssl.key: "/etc/pki/client/cert.key"
Logstash.yml
filter
{
if [tags] == "apache"
{
if [tags] == "apache_access"
{
grok
{
match => {"message" => ["%{IPORHOST:Client_IP}%{APACHEINFO}\[% {APACHEDATE:Timestamp}\]\s*%{APACHEDATA}","%{IPORHOST:Client_IP}%{APACHEINFO}\[%{APACHEDATE:Timestamp}\]\s*%{APACHEDATA}\s*%{APACHERADATA}"]}
patterns_dir => ["/etc/logstash/patterns"]
add_field => [ "received_at", "%{@timestamp}" ]
add_field => [ "received_from", "%{host}" ]
tag_on_failure => ["_grokparsefailure"]
}
if ["_grokparsefailure"] in [tags]
{
drop{}
}
date
{
match => [ "Timestamp", "dd/MMM/YYYY:H:m:s Z" ]
target => "Timestamp"
}
useragent
{
source => "Agent"
target => "User_Agent"
remove_field => "Agent"
}
geoip
{
source => "Client_IP"
target => "IP_Address"
}
}
}
}
the custom patterns are these;
APACHEINFO (\s*(?:%{USER:Identification}|-)\s*(?:%{USER:Username}|-)\s*)
APACHEDATE %{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{DATA:Timezone}
APACHEDATA "(?:%{WORD:Method} %{NOTSPACE:Request}(?: HTTP/%{NUMBER:Http_Version})?|%{DATA:Raw_Request})" %{NUMBER:Response_Code} (?:%{NUMBER:Bytes}|-)
APACHERADATA (?:"%{DATA:Referrer}" "%{DATA:Agent}")