Logstash grok multiline file

Hello, I have read a lot of articles here, but I still don't understand how to configure grok in Logstash properly. Here is an example file:

# Time: 2021-02-17T15:19:22.121290Z
# User@Host: app[app] @  [192.168.100.1]  Id: 4321231
# Query_time: 0.000022  Lock_time: 0.000000 Rows_sent: 0  Rows_examined: 0
SET timestamp=143241243;
SET autocommit=1;
# Time: 2021-01-17T11:12:22.125433222Z
# User@Host: app[app] @  [192.168.100.2]  Id: 543434
# Query_time: 0.000027  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0
SET timestamp=32434134;
SELECT @@session.tx_isolation;

As you can see, each entry in the file starts with "# Time" and ends with "SELECT" (instead of a SELECT there may be a slightly longer string; I need it in its entirety, I don't need to extract anything from inside it).
And I do not need lines 3 and 4 of each entry (that is, the Query_time and SET timestamp lines).
I have parsed these lines separately, but when Logstash is launched it does not parse them. Can anyone help me build a Logstash config?
My grok patterns for lines one through five:

^%{DATA} %{DATA}\:%{SPACE}%{GREEDYDATA:time}
^%{DATA} %{DATA}\:%{DATA}\@  \[%{IPV4:user_ip}]%{DATA}\: %{NUMBER:id}
%{GREEDYDATA}
%{GREEDYDATA}
%{GREEDYDATA:set_select}\;

Are you using a multiline codec to ingest the file as a single event? What does your grok configuration look like? Are you trying to do multiple match with break_on_match set to false? Are you trying to do a single multiline match?

@Badger I tried many configurations; here is one of the most recent:

input {
  file {
    path => "/home/user/file.log"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => [
      '^%{DATA} %{DATA}\:%{SPACE}%{GREEDYDATA:time}',
      '^%{DATA} %{DATA}\:%{DATA}\@  \[%{IPV4:user_ip}]%{DATA}\: %{NUMBER:id}',
      '%{GREEDYDATA}', '%{GREEDYDATA}',
      '%{GREEDYDATA:set_select}\;'
    ] }
  }
}

break_on_match - I did not change this parameter, so it keeps its default of true.
I also tried this config:

grok { match => { "message" => [ "Duration: %{NUMBER:duration}", "Speed: %{NUMBER:speed}" ] } }

It didn't seem to work either

A file input treats each line as a separate event. So the first event will only contain "message" => "# Time: 2021-02-17T15:19:22.121290Z". You could use something like

grok {
    match => {
        "message" => [
            "Time: %{NOTSPACE:time}",
            "%{IPV4:user_ip}"
        ]
    }
}

Note that you do not have to match the whole line. That second pattern will pick an IP address out of any line that contains one. Anchoring patterns is a good habit to get into, but it is not always the best approach.
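To see why anchoring is optional, note that grok patterns are just regular expressions under the hood, and an unanchored pattern matches anywhere in the line. A rough Python sketch (the regex below is a simplified stand-in for the real IPV4 grok pattern, not its exact definition):

```python
import re

# Simplified stand-in for the IPV4 grok pattern, with a named capture
# playing the role of the grok field name.
ipv4 = re.compile(r"(?P<user_ip>(?:\d{1,3}\.){3}\d{1,3})")

line = "# User@Host: app[app] @  [192.168.100.1]  Id: 4321231"
m = ipv4.search(line)      # search() scans the whole line, no anchor needed
print(m.group("user_ip"))  # -> 192.168.100.1
```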

You could use a multiline codec to pick up each of those two log entries as a single event

file {
    path => "/home/user/test.txt"
    sincedb_path => "/dev/null"
    start_position => beginning
    codec => multiline {
        pattern => "^# Time:"
        negate => true
        what => previous
        auto_flush_interval => 2
        multiline_tag => ""
    }
}
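Not the codec itself, of course, but the grouping logic of negate => true with what => previous can be sketched in a few lines of Python: any line that does NOT match the pattern is appended to the previous event, so each "# Time:" line starts a new event.

```python
import re

# Sketch of the multiline codec's grouping rule for this config.
pattern = re.compile(r"^# Time:")

lines = [
    "# Time: 2021-02-17T15:19:22.121290Z",
    "SET timestamp=143241243;",
    "SET autocommit=1;",
    "# Time: 2021-01-17T11:12:22.125433222Z",
    "SELECT @@session.tx_isolation;",
]

events = []
for line in lines:
    if pattern.match(line) or not events:
        events.append(line)         # matching line starts a new event
    else:
        events[-1] += "\n" + line   # non-matching line is glued to the previous one

print(len(events))  # -> 2, one event per "# Time:" header
```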

Which will produce

   "message" => "# Time: 2021-02-17T15:19:22.121290Z\n# User@Host: app[app] @  [192.168.100.1]  Id: 4321231\n# Query_time: 0.000022  Lock_time: 0.000000 Rows_sent: 0  Rows_examined: 0\nSET timestamp=143241243;\nSET autocommit=1;"
   "message" => "# Time: 2021-01-17T11:12:22.125433222Z\n# User@Host: app[app] @  [192.168.100.2]  Id: 543434\n# Query_time: 0.000027  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 0\nSET timestamp=32434134;\nSELECT @@session.tx_isolation;"

Then you can use a grok filter

    grok {
        break_on_match => false
        match => {
            "message" => [
                "Time: %{NOTSPACE:time}",
                "%{IPV4:user_ip}",
                "^#[^\n]+\n(?<allSQL>[^#]+)\Z"
            ]
        }
    }

which will produce

   "user_ip" => "192.168.100.2"
    "allSQL" => "SET timestamp=32434134;\nSELECT @@session.tx_isolation;",
      "time" => "2021-01-17T11:12:22.125433222Z",

If you just want the last line of the SQL then change the last pattern to

                "\n(?<lastSQL>[^\n]+)\Z"

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.