nkknkk
February 3, 2025, 10:40pm
1
I am trying to parse my sample catalina.out file.
Here is how I run Logstash with my tomcat.conf file:
./logstash -f conf.d/tomcat.conf -r
Here is my tomcat.conf:
input {
  file {
    path => "/elkstack/logs/input/catalina-sample.out"
    codec => multiline {
      pattern => "^\[<\d{1,2}"
      negate => "true"
      what => "previous"
    }
  }
}
filter {
  grok {
    match => { "message" => "%{GREEDYDATA:message}" }
  }
}
output {
  stdout{}
}
I am not seeing any parsing or any results. What am I doing wrong?
leandrojmp
(Leandro Pereira)
February 3, 2025, 11:17pm
2
What is the result you are getting? Please share the output from Logstash.
Also, your grok filter is doing nothing, so your output will be the same as the original message in your log file.
stephenb
(Stephen Brown)
February 3, 2025, 11:26pm
3
And if the file has been read already, it will not be reread unless you set
sincedb_path => "/dev/null"
in the input file config.
input {
  file {
    path => "/elkstack/logs/input/catalina-sample.out"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^\[<\d{1,2}"
      negate => "true"
      what => "previous"
    }
  }
}
Badger
February 4, 2025, 3:09am
4
Also, you may need to set auto_flush_interval on the codec.
nkknkk
February 10, 2025, 5:50pm
5
Thank you, Stephen. This solution seems to work for a static file. However, my log file is generated dynamically, and it gets deleted after every restart of Tomcat.
Now the parsing rules are only picking up the lines that start with \d{1,2}. They completely ignore the other lines that start with [<. I have no clue why it behaves this way.
This is what my config file looks like:
# cat elkstack.conf
input {
  file {
    path => "/opt/app/logs/catalina.out"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^(\[<\d{1,2}|\d{2}-\w{3}-\d{4})"
      negate => "true"
      what => "previous"
      auto_flush_interval => 5
    }
  }
}
filter {
  mutate {
    remove_field => [ "host", "@version", "event" ]
  }
}
output {
  elasticsearch {
    hosts => ["http://elkstack:9200"]
    index => "dev01"
  }
}
stephenb
(Stephen Brown)
February 10, 2025, 6:03pm
6
Are you sure the generated files follow the same patterns? Please post an anonymized sample. More often than not, the pattern is incorrect.
nkknkk
February 10, 2025, 6:13pm
7
Here is the sample log:
[<2 10, 2025 11:17:15 AM>:
Yr=2025
]
[<2 10, 2025 11:17:15 AM>: CurrentYear: close statement]
10-Feb-2025 11:17:09.201 WARNING [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference
java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
[<2 10, 2025 11:18:15 AM>:
Yr=2025
]
[<2 10, 2025 11:18:15 AM>: CurrentYear: close statement]
10-Feb-2025 11:17:09.201 WARNING [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference
java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
stephenb
(Stephen Brown)
February 10, 2025, 6:23pm
8
And why not just
pattern => "^(\[<|\d{2}-\w{3}-\d{4})"
According to regex101.com that should work... I see you have single-line entries as well... hmm... perhaps that is the issue.
I will try on my side...
Can you supply a sample that has both types, please?
stephenb
(Stephen Brown)
February 10, 2025, 6:53pm
9
Ok This ...
codec => multiline {
  pattern => "^\["
  negate => "true"
  what => "previous"
}
Works for these
[<2 10, 2025 11:17:15 AM>:
Yr=2025
]
[<2 10, 2025 11:17:15 AM>: CurrentYear: close statement]
10-Feb-2025 11:17:09.201 WARNING [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference
java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
[<2 10, 2025 11:18:15 AM>:
Yr=2024
]
[<2 10, 2025 11:18:15 AM>: CurrentYear: close statement]
11-Feb-2025 11:17:09.201 ERROR [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference
java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
]
[<2 10, 2025 11:18:15 AM>:
Yr=2024
]
Output
{
"@timestamp" => 2025-02-10T18:53:18.165960Z,
"message" => "[<2 10, 2025 11:18:15 AM>: CurrentYear: close statement]\n11-Feb-2025 11:17:09.201 ERROR [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference\n java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection\n]",
"@version" => "1"
}
{
"@timestamp" => 2025-02-10T18:53:18.165458Z,
"message" => "[<2 10, 2025 11:18:15 AM>:\nYr=2024\n]",
"@version" => "1"
}
{
"@timestamp" => 2025-02-10T18:53:18.165005Z,
"message" => "[<2 10, 2025 11:17:15 AM>: CurrentYear: close statement]\n10-Feb-2025 11:17:09.201 WARNING [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference\n java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection",
"@version" => "1"
}
{
"@timestamp" => 2025-02-10T18:53:18.164136Z,
"message" => "[<2 10, 2025 11:17:15 AM>:\nYr=2025\n]",
"@version" => "1"
}
Now, can you provide a mixed file, or is this it?
nkknkk
February 10, 2025, 8:27pm
10
That is the mixed sample I gave you.
The first message should only run until ": close statement]".
Message 2 should start with 11-Feb-2025.
The pattern I gave in the response above works perfectly on a static file.
It just does not do the same thing on a dynamic log file that gets deleted and repopulated with logs often.
stephenb
(Stephen Brown)
February 10, 2025, 8:29pm
11
Ahh OK... let me check...
OK, so this worked for me; note the missing ( ).
So it says:
starts with ^\[
or starts with ^\d{1,2}, or contains \d{2}-\w{3}-\d{4}
codec => multiline {
  pattern => "^\[|^\d{1,2}|\d{2}-\w{3}-\d{4}"
  negate => "true"
  what => "previous"
}
Input
[<2 10, 2025 11:17:15 AM>:
Yr=2025
]
[<2 10, 2025 11:17:15 AM>: CurrentYear: close statement]
10-Feb-2025 11:17:09.201 WARNING [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference
java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
[<2 10, 2025 11:18:15 AM>:
Yr=2024
]
[<2 10, 2025 11:18:15 AM>: CurrentYear: close statement]
11-Feb-2025 11:17:09.201 ERROR [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference
java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
]
[<2 10, 2025 11:18:15 AM>:
Yr=2024
]
Output
{
"@timestamp" => 2025-02-10T20:31:35.653999Z,
"message" => "[<2 10, 2025 11:18:15 AM>:\nYr=2024\n]",
"@version" => "1"
}
{
"@timestamp" => 2025-02-10T20:31:36.288355Z,
"message" => "[<2 10, 2025 11:17:15 AM>:\nYr=2025\n]",
"@version" => "1"
}
{
"@timestamp" => 2025-02-10T20:31:36.290290Z,
"message" => "11-Feb-2025 11:17:09.201 ERROR [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference\n java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection\n]",
"@version" => "1"
}
{
"@timestamp" => 2025-02-10T20:31:36.289482Z,
"message" => "[<2 10, 2025 11:18:15 AM>:\nYr=2024\n]",
"@version" => "1"
}
{
"@timestamp" => 2025-02-10T20:31:36.289715Z,
"message" => "[<2 10, 2025 11:18:15 AM>: CurrentYear: close statement]",
"@version" => "1"
}
{
"@timestamp" => 2025-02-10T20:31:36.288636Z,
"message" => "[<2 10, 2025 11:17:15 AM>: CurrentYear: close statement]",
"@version" => "1"
}
{
"@timestamp" => 2025-02-10T20:31:36.289027Z,
"message" => "10-Feb-2025 11:17:09.201 WARNING [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference\n java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection",
"@version" => "1"
}
nkknkk
February 10, 2025, 8:34pm
12
OK, let me try removing the ( ).
stephenb
(Stephen Brown)
February 10, 2025, 8:34pm
13
Try exactly what I have....
nkknkk
February 10, 2025, 8:36pm
14
I can't use exactly what you have, because I have messages that have [ in the middle and I don't want to break the line there. That is the reason I had [<\d{1,2}
stephenb
(Stephen Brown)
February 10, 2025, 8:38pm
15
I don't understand...
^ means anchor at the beginning of the line; that is regex.
^\[ means only a [ at the beginning of the line, so a [ in the middle will not be matched. That is the whole point.
You can see it worked that way; note the [ in the middle of the line:
{
"@timestamp" => 2025-02-10T20:31:36.290290Z,
"message" => "11-Feb-2025 11:17:09.201 ERROR [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference\n java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection\n]",
"@version" => "1"
}
and likewise ^\d{1,2} means only at the beginning of the line (the last alternative, \d{2}-\w{3}-\d{4}, has no ^ of its own, so it can match anywhere in the line).
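One detail worth checking when a pattern mixes ^ and |: alternation binds more loosely than the anchor, so a ^ only applies to the alternative it is written in. Below is a minimal Python sketch comparing the ungrouped pattern from this thread with a grouped variant. It assumes Python's re module treats these simple patterns the same way as the regex engine behind Logstash's multiline codec. The test lines (apart from the first) and the starts_event helper are made up for illustration.

```python
import re

# Pattern from the thread: only the first two alternatives carry ^.
ungrouped = re.compile(r"^\[|^\d{1,2}|\d{2}-\w{3}-\d{4}")
# Variant with a single ^ applied to a group, so every alternative is anchored.
grouped = re.compile(r"^(\[|\d{1,2}-\w{3}-\d{4})")

# Illustrative lines (hypothetical except the first) covering each case.
lines = [
    "[<2 10, 2025 11:17:15 AM>: CurrentYear: close statement]",
    "11-Feb-2025 11:17:09.201 ERROR [main] lookup failed",
    "WARNING [main] bracket only in the middle",
    "retry scheduled for 11-Feb-2025",
    "2025 rows processed",
]


def starts_event(pattern, line):
    """True if the line would be treated as the start of a new event."""
    return pattern.search(line) is not None


# Map each line to (ungrouped result, grouped result).
results = {
    line: (starts_event(ungrouped, line), starts_event(grouped, line))
    for line in lines
}

for line, (u, g) in results.items():
    print(f"ungrouped={u!s:5} grouped={g!s:5} {line}")
```

A [ in the middle of a line is ignored by both patterns, as described above. The two patterns differ on the last two lines: the ungrouped one also starts a new event on a date that appears mid-line and on any line that begins with a digit.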
nkknkk
February 10, 2025, 8:42pm
16
oh right ok. my bad. I can try exactly what you have
"^\[|^\d{1,2}|\d{2}-\w{3}-\d{4}"
stephenb
(Stephen Brown)
February 10, 2025, 8:47pm
17
And I think you really only need
pattern => "^\[|^\d{1,2}-\w{3}-\d{4}"
because \d{1,2} matches 1 or 2 digits.
I tried with
11-Feb-2025
9-Feb-2025
Both matched. Use regex101.com to test; I then tested with Logstash as well, and both were verified.
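To see how this last pattern would split the sample posted earlier in the thread, here is a rough Python sketch of the grouping rule behind negate => "true" and what => "previous": a line that matches the pattern starts a new event, and any other line is appended to the previous one. The group_events function is a hypothetical helper, not part of Logstash, and it ignores file handling and auto_flush_interval. It also assumes Python's re module behaves like Logstash's regex engine for this simple pattern.

```python
import re


def group_events(lines, pattern):
    """Imitate multiline with negate => true, what => previous."""
    start = re.compile(pattern)
    events = []
    for line in lines:
        if start.search(line) or not events:
            # The line matches the pattern (or is the very first line): new event.
            events.append(line)
        else:
            # The line does not match: it belongs to the previous event.
            events[-1] += "\n" + line
    return events


# First half of the sample log posted in the thread.
sample = [
    "[<2 10, 2025 11:17:15 AM>:",
    "Yr=2025",
    "]",
    "[<2 10, 2025 11:17:15 AM>: CurrentYear: close statement]",
    "10-Feb-2025 11:17:09.201 WARNING [main] org.apache.naming.NamingContext.lookup Unexpected exception resolving reference",
    " java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection",
]

events = group_events(sample, r"^\[|^\d{1,2}-\w{3}-\d{4}")

for event in events:
    print(repr(event))
```

With this sample the sketch produces three events: the Yr=2025 block, the single-line "close statement" entry, and the WARNING line together with its exception line.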
nkknkk
February 10, 2025, 8:54pm
18
Well, it works with the static file. I will have to wait until tomorrow to see what happens.
nkknkk
February 10, 2025, 8:56pm
19
This is what is happening, as I mentioned before: every day we have a deployment and a server restart. When we do that, the old log file gets deleted and a new log file with the same name and format is created with new content. This is where the problem happens. I will have to wait until tomorrow to see if this works.
stephenb
(Stephen Brown)
February 10, 2025, 9:01pm
20
Sounds good. If it works on this static file but not on a new file...
either the file is not the same, or somehow your Logstash config has been altered, or something else is going on.
You will need to show exactly what happens; then perhaps we can help.
If that is the case, I would say open a new thread with an exact subject and plenty of detail...