Hi All,
I have the following grok filter for my Apache logs:
else if "apache" in [tags] {
  grok {
    match => { "message" => '%{COMBINEDAPACHELOG}' }
  }
}
This works for about 75% of ingested logs, but some of my Apache log files use slightly different formats. Some have a port number after the hostname (hostname:8080), and some have stray backslashes in front of the double quotes. Examples are below; the first is the odd one, with the characters that cause grok failures in **bold**.
10.101.76.157 10.101.76.157 vdpswebdev01.qualcomm.com**:8030** - [10/Oct/2018:11:45:00 -0700] **\**"GET /solr-chipcode/collection1/replication?command=indexversion&wt=javabin&qt=%2Freplication&version=2 HTTP/1.1\" 200 76 **\**"-**\**" **\**"Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0**\**"
10.101.76.147 10.101.76.147 vdpswebdev01.qualcomm.com - [09/Oct/2018:13:45:55 -0700] "GET /solr-cpip-global/cpip-directory-path/replication?command=indexversion&wt=javabin&qt=%2Freplication&version=2 HTTP/1.1" 200 80 "-" "Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 1.0"
172.30.42.114, 10.49.16.6 10.49.16.6 prismsearch-dev.qualcomm.com - [09/Oct/2018:13:52:09 -0700] "GET /solr-prism/collection1/select?q=*%3A*&wt=json&indent=true HTTP/1.1" 200 4200 "-" "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)"
My question: do I need to write a custom grok pattern for every log line variation, or is there a saner approach?
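For context, the custom route I am hoping to avoid would look roughly like the sketch below. It is untested. The field names `forwarded_for`, `vhost`, and `vhost_port` are placeholders I made up, and I am assuming that the first column is a forwarded-for list, that the backslashes are escaping artifacts that are safe to strip before matching, and that `config.support_escapes` is left at its default (off).

```
else if "apache" in [tags] {
  # Assumption: the backslashes are escaping artifacts and safe to drop.
  # The regex \\" matches a literal backslash followed by a double quote.
  mutate {
    gsub => [ "message", '\\"', '"' ]
  }
  grok {
    # forwarded_for, vhost and vhost_port are made-up field names.
    # The optional group (?::%{POSINT:vhost_port})? tolerates "hostname:8030".
    match => { "message" => '^%{DATA:forwarded_for} %{IPORHOST:clientip} %{IPORHOST:vhost}(?::%{POSINT:vhost_port})? - \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}' }
  }
}
```

This covers the three sample lines above on paper, but it is exactly the kind of hand-rolled pattern I would rather not maintain for every variation.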
Cheers!