COMBINEDAPACHELOG Outliers

Hi All,

So I have the following GROK filter going for my Apache logs.

else if "apache" in [tags] {
    grok {
      match => { "message" => '%{COMBINEDAPACHELOG}' }
    }
}

This works for ~75% of ingested logs, but some of my Apache log files use slightly different formats. Some contain a port number next to the hostname (hostname:8080), and some have stray backslashes ('\') sprinkled in for some reason. Examples are below; the odd message is the first one, and the characters causing grok failures are in bold.

10.101.76.157 10.101.76.157 vdpswebdev01.qualcomm.com**:8030** - [10/Oct/2018:11:45:00 -0700] **\**"GET /solr-chipcode/collection1/replication?command=indexversion&wt=javabin&qt=%2Freplication&version=2 HTTP/1.1\" 200 76 **\**"-**\**" **\**"Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0**\**"

10.101.76.147 10.101.76.147 vdpswebdev01.qualcomm.com - [09/Oct/2018:13:45:55 -0700] "GET /solr-cpip-global/cpip-directory-path/replication?command=indexversion&wt=javabin&qt=%2Freplication&version=2 HTTP/1.1" 200 80 "-" "Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 1.0"

172.30.42.114, 10.49.16.6 10.49.16.6 prismsearch-dev.qualcomm.com - [09/Oct/2018:13:52:09 -0700] "GET /solr-prism/collection1/select?q=*%3A*&wt=json&indent=true HTTP/1.1" 200 4200 "-" "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)"

My question is: do I need to create customized filters to capture every log line variation, or is there a saner approach?

Cheers!

Note that the characters causing the parsing to choke are marked with **.

Is the simple answer that yes, I need to create customized grok filters for every log line variation?

Yes. You will need to create patterns that match the variations, as fields are only extracted on a successful match.
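
That does not have to mean one grok block per variation, though. The grok filter accepts a list of patterns for a single field and tries them in order, so the variations can live in one place. Below is a rough, untested sketch of what that might look like for the first sample line. The second pattern and its field names (peer_ip, vhost, vhost_port and so on) are made up for illustration, and it assumes the stray backslashes are literally present in the raw message and that config.support_escapes is left at its default of false. It does not cover the third sample, which starts with a comma-separated address list.

```
else if "apache" in [tags] {
    grok {
      # Patterns are tried top to bottom; with the default
      # break_on_match => true, grok stops at the first one that matches.
      # Pattern 1: the stock combined log format.
      # Pattern 2 (hypothetical): extra address and vhost columns, an
      # optional :port after the vhost, and an optional backslash before
      # each double quote.
      match => {
        "message" => [
          '%{COMBINEDAPACHELOG}',
          '%{IPORHOST:clientip} %{IPORHOST:peer_ip} %{IPORHOST:vhost}(?::%{POSINT:vhost_port})? %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] \\?"%{WORD:verb} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}\\?" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \\?"%{DATA:referrer}\\?" \\?"%{DATA:agent}\\?"$'
        ]
      }
    }
}
```

Events that match none of the listed patterns still get the _grokparsefailure tag, so it is worth checking for that tag to find any formats that are still unaccounted for.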
