Logstash grok pattern: problem with matching the correct ")"

Hi there and thanks for your time in advance.

I built my first grok pattern myself to split my events into separate fields.
The filter looks like this:

  filter {
    grok {
      match => {
        "message" => "\A%{TIMESTAMP_ISO8601:fields.timestamp}%{SPACE}%{LOGLEVEL:fields.loglevel}%{SPACE}%{NOTSPACE:fields.loggername}%{SPACE}\(%{GREEDYDATA:fields.thread}**\\)%{SPACE}**%{GREEDYDATA:fields.message}"
      }
    }
  }

This works perfectly on most lines, but on some of them the filter does not match the bold part (between the ** markers) correctly, even though the line contains such a pattern.
Working example:


2019-07-24 08:51:59,209 INFO [some.AbstractBean] (default task-114) check media url: https://somewhere.com:443/somewhere//@124766750.jpg

gets split into:

fields.thread: default·task-114
fields.timestamp: 2019-07-24·08:51:59,209
fields.loggername: [some.AbstractBean]
fields.loglevel: INFO
fields.message: check·media·url:·https://somewhere.com:443/somewhere//@124766750.jpg

BUT:

2019-07-24 08:51:59,232 INFO [some.Positionscontainer] (default task-114) Time processWert-getArtikelpreisAusStaffel: 2019-07-24 08:51:59.232 katId: 2320 , artikel.getSupplier_aid():124766750 ,anzahl: 2

gets split at the wrong ")":

    fields.thread: default·task-114)·Time·processWert-getArtikelpreisAusStaffel:·2019-07-24·08:51:59.232·katId:·2320·,·artikel.getSupplier_aid(
    fields.timestamp: 2019-07-24·08:51:59,232
    fields.loggername: [some.Positionscontainer]
    fields.loglevel: INFO
    fields.message: :124766750·,anzahl:·2

Instead of splitting at the first closing parenthesis, grok splits at the second one (and swallows it). This does not happen just once but regularly (in roughly 10% of the lines), and I have no clue why.
I had the programmers check their logger code for stray extra characters, but there are none.

Does anyone have an idea why this happens? What am I doing wrong here?

No one has an idea? Am I maybe using the filter wrong, or using the wrong patterns?
Is it even clear what the problem is here?

Please edit your post. Select the configuration and click on </> in the toolbar above the edit pane. In the preview pane on the right you will see it formatted

like
    this. With indentation preserved.

Then do the same for the log file entries.

done

Well that gets me

#<RegexpError: unmatched close parenthesis:

but if I make the most probable corrections I can reproduce your results. Change the GREEDYDATA inside the \( \) to just DATA. You do not want it to be greedy.
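
To illustrate the difference outside of Logstash, here is a minimal Python sketch. It assumes that grok's GREEDYDATA expands to `.*` and DATA to `.*?`, and it reduces the rest of the grok expression to a simplified plain regex. The names `GREEDY`, `LAZY` and `split_thread` are made up for this example and are not part of Logstash.

```python
import re

# The problematic log line from the question.
LINE = (
    "2019-07-24 08:51:59,232 INFO [some.Positionscontainer] (default task-114) "
    "Time processWert-getArtikelpreisAusStaffel: 2019-07-24 08:51:59.232 "
    "katId: 2320 , artikel.getSupplier_aid():124766750 ,anzahl: 2"
)

# Assumption: GREEDYDATA behaves like `.*`, DATA like `.*?`.
# Only the "(thread) message" part of the pattern is modelled here.
GREEDY = re.compile(r"\((?P<thread>.*)\)\s*(?P<message>.*)")
LAZY = re.compile(r"\((?P<thread>.*?)\)\s*(?P<message>.*)")


def split_thread(pattern, line):
    """Return (thread, message) for the first match, or None if nothing matches."""
    m = pattern.search(line)
    return (m.group("thread"), m.group("message")) if m else None


if __name__ == "__main__":
    print("greedy:", split_thread(GREEDY, LINE))
    print("lazy:  ", split_thread(LAZY, LINE))
```

With the greedy variant, the thread capture runs up to the last ")" on the line, which is the one in getSupplier_aid(), so the message shrinks to ":124766750 ,anzahl: 2". The lazy variant stops at the first ")" and yields "default task-114" as the thread.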


That helped! You can't imagine how happy I am that this got solved. Thanks a lot :smiley:

This is what it looks like in the end:

  filter {
    grok {
      match => {
        "message" => "\A%{TIMESTAMP_ISO8601:fields.timestamp}%{SPACE}%{LOGLEVEL:fields.loglevel}%{SPACE}%{NOTSPACE:fields.loggername}%{SPACE}\(%{DATA:fields.thread}\)%{SPACE}%{GREEDYDATA:fields.message}"
      }
    }
  }
