Hi All,

So I have the following GROK filter going for my Apache logs.

else if "apache" in [tags] {
    grok {
      match => { "message" => '%{COMBINEDAPACHELOG}' }
    }
}

This works for ~75% of ingested logs, but some of my Apache log files have slightly different formats. Some contain a port number next to the hostname (hostname:8080) and some have stray backslash characters ('\') sprinkled in for some reason. Examples are below, with the odd message first and the characters causing grok failures in bold.

.**:8030** - [10/Oct/2018:11:45:00 -0700] **\**"GET /solr-chipcode/collection1/replication?command=indexversion&wt=javabin&qt=%2Freplication&version=2 HTTP/1.1\" 200 76 **\**"-**\**" **\**"Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0**\**"

- [09/Oct/2018:13:45:55 -0700] "GET /solr-cpip-global/cpip-directory-path/replication?command=indexversion&wt=javabin&qt=%2Freplication&version=2 HTTP/1.1" 200 80 "-" "Solr[org.apache.solr.client.solrj.impl.HttpSolrClient] 1.0",

- [09/Oct/2018:13:52:09 -0700] "GET /solr-prism/collection1/select?q=*%3A*&wt=json&indent=true HTTP/1.1" 200 4200 "-" "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT)"

My question: do I need to create customized filters to capture every log line variation, or is there a saner approach?



Note: the characters that cause the parsing to choke are marked with **.


Is the simple answer "yes, I need to create customized grok filters for every log line variation"?

(Christian Dahlqvist) #4

Yes. You will need to create patterns that match the variations, as fields are only extracted on a successful match.
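
For illustration, here is a minimal sketch of what "patterns that match the variations" could look like in a single grok filter. It is not a tested drop-in config. It assumes the stray backslashes are literally present in the `message` field, that Logstash runs with its default escape handling (`config.support_escapes` off), and that the port variant has the same layout as the combined log format with `:port` appended to the host. The sample lines in the question are partially scrubbed, so the second pattern and the `port` field name are hypothetical and would need adjusting to the real lines.

```
filter {
  if "apache" in [tags] {
    # Assumption: the backslashes are really in the event, not just added by
    # whatever displayed it. The character class [\\] matches one backslash.
    # Note this strips ALL backslashes from the message, including legitimate ones.
    mutate {
      gsub => [ "message", "[\\]", "" ]
    }

    grok {
      # Patterns are tried in order and the first match wins.
      match => {
        "message" => [
          '%{COMBINEDAPACHELOG}',
          # Hypothetical variant: same layout, but host is followed by :port.
          '%{IPORHOST:clientip}:%{POSINT:port} %{HTTPDUSER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}'
        ]
      }
    }
  }
}
```

Listing several patterns under one `match` keeps the variations in one place. Events that match none of them are tagged `_grokparsefailure` by default, which makes it easy to find the remaining formats that still need a pattern.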
