I have been trying to use the Tomcat module for Filebeat 7.9 (see "Tomcat module" in the Filebeat Reference [7.9]) to ingest Tomcat access logs from files, but my log files are producing errors. I'd like to know what format the module expects the file-based logs to be in.
The Tomcat log format is configured in server.xml, and the default format out of the box is "%h %l %u %t "%r" %s %b", with the fields defined as:
- %h: Remote IP address
- %l: Remote logical username from identd (always returns '-')
- %u: Remote user that was authenticated (if any), else '-' (escaped if required)
- %t: Date and time, in Common Log Format
- %r: First line of the request (method and request URI)
- %s: HTTP status code of the response
- %b: Bytes sent, excluding HTTP headers, or '-' if zero
Source: https://tomcat.apache.org/tomcat-9.0-doc/config/valve.html#Access_Log_Valve
(This is the version 9 documentation, but the format appears to be the same in Tomcat 10.)
This default format results in logs that look like:
111.22.0.333 - - [07/May/2021:17:08:49 -0400] "GET /app-path?name=john HTTP/1.1" 302 -
111.33.0.111 - - [07/May/2021:17:08:49 -0400] "GET /app-path/ HTTP/1.1" 200 14291
111.44.0.111 - - [07/May/2021:17:08:50 -0400] "GET /favicon.ico HTTP/1.1" 404 762
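As a sanity check that these lines really do follow the default pattern, here is a minimal Python sketch that splits one line into the fields listed above. The regular expression and the group names (host, ident, user, and so on) are my own illustration and are not taken from Tomcat or from the Filebeat module; it only shows what the file contains, not what the module expects.

```python
import re

# Regex mirroring Tomcat's default '%h %l %u %t "%r" %s %b' pattern.
# Group names are illustrative assumptions, not Tomcat or Filebeat names.
ACCESS_LINE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)


def parse_access_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = ACCESS_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # %b logs '-' when no bytes were sent; treat that as 0.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields


sample = (
    '111.22.0.333 - - [07/May/2021:17:08:49 -0400] '
    '"GET /app-path?name=john HTTP/1.1" 302 -'
)
parsed = parse_access_line(sample)
print(parsed)
```

All three sample lines above match this pattern, so the files themselves appear to be plain Common Log Format.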
My module config looks like:
- module: tomcat
  log:
    enabled: true
    # Set which input to use between udp (default), tcp or file.
    var.input: file
    # Set paths for the log files when file input is used.
    var.paths:
     - /usr/local/tomcat/logs/localhost_access_log.*
When the logs are ingested into Elasticsearch, they clearly have not been parsed properly and carry a dissect_parsing_error flag in log.flags, for example:
{
  "_index": "filebeat-7.9.2-default-2021.03.20-000004",
  "_type": "_doc",
  "_id": "CWKpSHkBuDbbQBpRlerr",
  "_score": 1,
  "_source": {
    "agent": {
      "hostname": "myserver.domain.com",
      "name": "myserver.domain.com",
      "id": "<hex string>",
      "type": "filebeat",
      "ephemeral_id": "<hex string>",
      "version": "7.9.2"
    },
    "log": {
      "file": {
        "path": "/usr/local/tomcat/logs/localhost_access_log.2021-05-07.txt"
      },
      "offset": 243,
      "flags": [
        "dissect_parsing_error"
      ]
    },
    "fileset": {
      "name": "log"
    },
    "tags": [
      "tomcat.log",
      "forwarded"
    ],
    "observer": {
      "product": "TomCat",
      "vendor": "Apache",
      "type": "Web"
    },
    "input": {
      "type": "log"
    },
    "@timestamp": "2021-05-07T21:09:03.105Z",
    "ecs": {
      "version": "1.5.0"
    },
    "service": {
      "type": "tomcat"
    },
    "event": {
      "original": "111.12.0.333 - - [07/May/2021:17:08:57 -0400] \"GET /app-path?name=john HTTP/1.1\" 200 527",
      "module": "tomcat",
      "dataset": "tomcat.log"
    }
  },
  "fields": {
    "@timestamp": [
      "2021-05-07T21:09:03.105Z"
    ]
  }
}
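To see how many events are affected, a small Python sketch like the following can pick out the flagged documents and the raw line each one failed on. It assumes the documents have already been exported as dicts shaped like the one above; the helper name failed_original is hypothetical, and the sample dict is a trimmed copy of the document shown, not new data.

```python
def failed_original(doc):
    """Return event.original if the doc carries dissect_parsing_error, else None."""
    source = doc.get("_source", {})
    flags = source.get("log", {}).get("flags", [])
    if "dissect_parsing_error" in flags:
        return source.get("event", {}).get("original")
    return None


# Trimmed copy of the document shown above.
doc = {
    "_source": {
        "log": {"offset": 243, "flags": ["dissect_parsing_error"]},
        "event": {
            "original": '111.12.0.333 - - [07/May/2021:17:08:57 -0400] '
                        '"GET /app-path?name=john HTTP/1.1" 200 527',
            "module": "tomcat",
            "dataset": "tomcat.log",
        },
    }
}

raw_line = failed_original(doc)
print(raw_line)
```

In my case every document from these files carries the flag, and event.original holds the untouched access log line.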
I turned on debug in the Tomcat module config and got the following messages in the Filebeat log:
2021-05-06T21:49:47.306-0400	WARN	[processor.javascript]	console/console.go:52	msgid_select: no messageid captured!
2021-05-06T21:49:47.307-0400	WARN	[processor.javascript]	console/console.go:52	linear_select trying entry 0
2021-05-06T21:49:47.307-0400	WARN	[processor.javascript]	console/console.go:52	linear_select failed entry 0
2021-05-06T21:49:47.307-0400	WARN	[processor.javascript]	console/console.go:52	linear_select trying entry 1
2021-05-06T21:49:47.307-0400	WARN	[processor.javascript]	console/console.go:52	linear_select failed entry 1
2021-05-06T21:49:47.307-0400	WARN	[processor.javascript]	console/console.go:52	linear_select didn't match
Any help would be appreciated. If someone could point me to an example of a working file-based config, that would be great.