Logstash grok multiline file

Hello, I have read a lot of articles here, but I still don't understand how to configure grok in Logstash properly. Here is an example file:

# Time: 2021-02-17T15:19:22.121290Z
# User@Host: app[app] @  []  Id: 4321231
# Query_time: 0.000022  Lock_time: 0.000000 Rows_sent: 0  Rows_examined: 0
SET timestamp=143241243;
SET autocommit=1;
# Time: 2021-01-17T11:12:22.125433222Z
# User@Host: app[app] @  []  Id: 543434
# Query_time: 0.000027  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
SET timestamp=32434134;
SELECT @@session.tx_isolation;

As you can see, each entry in the file starts with "# Time" and ends with the "SELECT" line (instead of a SELECT there may be a slightly longer statement; I need that final line in its entirety, I don't need to extract anything from it).
I also do not need lines 3 and 4 of each entry (that is, the Query_time and SET lines).
I have tested grok patterns for these lines separately, but when Logstash is launched it does not parse them. Can anyone help me put together a Logstash config?
Here is one of my grok patterns (this one is for the User@Host line):

^%{DATA} %{DATA}\:%{DATA}\@  \[%{IPV4:user_ip}]%{DATA}\: %{NUMBER:id}

Are you using a multiline codec to ingest the file as a single event? What does your grok configuration look like? Are you trying to do multiple match with break_on_match set to false? Are you trying to do a single multiline match?

@Badger I have tried many configurations; here is one of the most recent:

input {
  file {
    path => "/home/user/file.log"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => [
      '^%{DATA} %{DATA}\:%{SPACE}%{GREEDYDATA:time}',
      '^%{DATA} %{DATA}\:%{DATA}\@  \[%{IPV4:user_ip}]%{DATA}\: %{NUMBER:id}',
      '%{GREEDYDATA}', '%{GREEDYDATA}',
      '%{GREEDYDATA:set_select}\;'
    ] }
  }
}

break_on_match: I did not change this parameter, so it defaults to true.
I also tried this config:

grok { match => { "message" => [ "Duration: %{NUMBER:duration}", "Speed: %{NUMBER:speed}" ] } }

It didn't seem to work either

A file input treats each line as a separate event. So the first event will only contain "message" => "# Time: 2021-02-17T15:19:22.121290Z". You could use something like

grok {
    match => {
        "message" => [
            "Time: %{NOTSPACE:time}",
            "\[%{IPV4:user_ip}\]"
        ]
    }
}

Note that you do not have to match the whole line. That second pattern will pick an IP address out of any line that contains one. Anchoring patterns is a good habit to get into, but it is not always the best approach.
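To illustrate with plain regexes, here is a rough Python analogue (a simplified pattern standing in for the IPV4 grok pattern; the sample line and IP address are made up, since the real log has empty brackets):

```python
import re

# Simplified stand-in for a grok pattern that extracts a bracketed IPv4 address.
ip = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")            # unanchored
ip_anchored = re.compile(r"^\[(\d{1,3}(?:\.\d{1,3}){3})\]")  # anchored to line start

# Hypothetical line with a real IP between the brackets.
line = "# User@Host: app[app] @ [10.0.0.5]  Id: 4321231"

print(ip.search(line).group(1))  # matches mid-line: 10.0.0.5
print(ip_anchored.search(line))  # None: the line does not start with "["
```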

You could use a multiline codec to pick up each of those two log entries as a single event

file {
    path => "/home/user/test.txt"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => multiline {
        pattern => "^# Time:"
        negate => true
        what => "previous"
        auto_flush_interval => 2
        multiline_tag => ""
    }
}

Which will produce

   "message" => "# Time: 2021-02-17T15:19:22.121290Z\n# User@Host: app[app] @  []  Id: 4321231\n# Query_time: 0.000022  Lock_time: 0.000000 Rows_sent: 0  Rows_examined: 0\nSET timestamp=143241243;\nSET autocommit=1;"
   "message" => "# Time: 2021-01-17T11:12:22.125433222Z\n# User@Host: app[app] @  []  Id: 543434\n# Query_time: 0.000027  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0\nSET timestamp=32434134;\nSELECT @@session.tx_isolation;"

Then you can use a grok filter

    grok {
        break_on_match => false
        match => {
            "message" => [
                "Time: %{NOTSPACE:time}",
                "@\s+\[%{DATA:user_ip}\]",
                "(?m)Rows_examined: %{NUMBER}\n%{GREEDYDATA:allSQL}"
            ]
        }
    }

which will produce

   "user_ip" => ""
    "allSQL" => "SET timestamp=32434134;\nSELECT @@session.tx_isolation;",
      "time" => "2021-01-17T11:12:22.125433222Z",

If you just want the last line of the SQL then change the last pattern to


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.