Hi All,
I am having a lot of trouble parsing logs, especially logs with different date formats and entries that span multiple lines.
For example, the head (very first 10 lines) of one of my log files, catalina.out, looks like this:
NOTE: Picked up JDK_JAVA_OPTIONS: --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.rmi/sun.rmi.transport=ALL-UNNAMED
[0.001s][warning][gc] -Xloggc is deprecated. Will use -Xlog:gc:/path/to/gc.log instead.
12-Sep-2023 20:24:49.876 WARNING [main] org.apache.tomcat.util.digester.SetPropertiesRule.begin Match [Server] failed to set property [debug] to [0]
[root@DRACO4614424 config]#
12-Sep-2023 20:24:49.921 WARNING [main] org.apache.tomcat.util.digester.SetPropertiesRule.begin Match [Server/Service/Connector] failed to set property [debug] to [0]
12-Sep-2023 20:24:50.231 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Tomcat-Standalone]
12-Sep-2023 20:24:50.231 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/9.0.76]
...
I want the next 4 lines to go into one (multiline) log event, up to the next line that starts with a date:
12-Sep-2023 20:24:51.848 INFO [Catalina-utility-2] org.apache.jasper.servlet.TldScanner.scanJars At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
12-Sep-2023 20:24:51.877 INFO [Catalina-utility-2] org.apache.catalina.core.ApplicationContext.log 1 Spring WebApplicationInitializers detected on classpath
12-Sep-2023 20:24:52.041 INFO [Catalina-utility-2] org.apache.catalina.core.ApplicationContext.log Initializing Spring DispatcherServlet 'dispatcher'
Here you can see that some lines start with a date and some don't!
I guess the lines that don't start with a date have to be dropped.
I want each line that starts with a date to absorb all the lines that follow it into the same event, until another line starts with a date.
To do that, I believe I have to use the multiline codec, so I tried this, which works elsewhere, just not here:
input {
  file {
    path => "/path/to/logs/catalina.*"
    start_position => "beginning"
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601}"
      negate => true
      what => "previous"
    }
  }
}
With the following filter:
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}\s+%{GREEDYDATA:message}" }
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
  #dissect {
  #  mapping => { "message" => "%{[@metadata][timestamp]} %{+[@metadata][timestamp]} %{message}" }
  #}
  #if "_dissectfailure" in [tags] {
  #  drop { }
  #}
  #if "_dateparsefailure" in [tags] {
  #  drop { }
  #}
}
output {
  file {
    path => "/path/to/log.out"
  }
}
Whether I use grok or dissect, the pipeline starts but no output file is created!
The input file does receive new data every few seconds; I have checked with tail.
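One thing that might be worth ruling out here: the file input keeps track of how far it has read each file in a sincedb, and start_position => "beginning" only applies to files it has not seen before. Below is a minimal test sketch, assuming it is acceptable to re-read the files on every restart (sincedb_path => "/dev/null" is a testing-only setting) and using a temporary stdout output to see whether any events arrive at all, before any codec or filter is involved:

```
input {
  file {
    path => "/path/to/logs/catalina.*"
    start_position => "beginning"
    # Testing only: do not persist read positions, so files are re-read on restart
    sincedb_path => "/dev/null"
  }
}
output {
  # Temporary: print every event to the console to confirm that lines are being read
  stdout { codec => rubydebug }
}
```

If events show up here but not with the full pipeline, the problem is more likely in the multiline pattern or the filter than in the file input itself.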
Really confused.
What is the best way to parse these kinds of multiline logs using Logstash?
I strongly suspect it has something to do with the date parsing, though I might be wrong.
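To illustrate that suspicion: the sample lines start with timestamps like 12-Sep-2023 20:24:49.876 (day, month name, year), which is not the year-first layout that TIMESTAMP_ISO8601 expects. With negate => true, every line would then be treated as a continuation, and the grok would find nothing to match. Below is a sketch of a variant built from the stock grok patterns MONTHDAY, MONTH, YEAR and TIME. The field name timestamp, the auto_flush_interval value and the date format string are assumptions, and the whole thing is untested against these files:

```
input {
  file {
    path => "/path/to/logs/catalina.*"
    start_position => "beginning"
    codec => multiline {
      # A new event starts at a line beginning with e.g. 12-Sep-2023 20:24:49.876
      pattern => "^%{MONTHDAY}-%{MONTH}-%{YEAR} %{TIME}"
      negate => true
      what => "previous"
      # Assumed value: flush a pending event after 2 seconds without new lines
      auto_flush_interval => 2
    }
  }
}
filter {
  grok {
    # (?m) lets GREEDYDATA run across the newlines inside a multiline event
    match => { "message" => "(?m)^(?<timestamp>%{MONTHDAY}-%{MONTH}-%{YEAR} %{TIME})\s+%{GREEDYDATA:message}" }
    # Replace the original message instead of appending a second value to it
    overwrite => [ "message" ]
  }
  if "_grokparsefailure" in [tags] {
    # Drops events that do not start with a date, such as the first lines of the file
    drop { }
  }
  date {
    # Assumed format for timestamps like 12-Sep-2023 20:24:49.876
    match => [ "timestamp", "dd-MMM-yyyy HH:mm:ss.SSS" ]
  }
}
```

In this sketch the SLF4J lines would be appended to the preceding dated line, while anything before the first dated line would fail the grok and be dropped.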
Appreciate all your help,
Thanks again...