Grock filter to parse SQL query

wafaAG · June 20, 2018, 2:56pm

Hi,

i have some events on application logs that i want to parse (which logging slow SQL query), the events is like that :

19-Jun-2018 15:16:28.670 WARNING [ajp-nio-8009-exec-67] org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport.reportSlowQuery Slow Query Report SQL=select * from ( select a.*, ROWNUM rnum from (
        select distinct RES.*

        from ACT_HI_TASKINST RES










     order by RES.PRIORITY_ desc, RES.START_TIME_ asc
      ) a where ROWNUM < ?) where rnum  >= ?; time=10212 ms;
19-Jun-2018 15:16:29.431 WARNING [ajp-nio-8009-exec-74] org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport.reportSlowQuery Slow Query Report SQL=/* SELECT COUNT(*) FROM DocumentAuto docAuto WHERE docAuto.etat.typeEtat.code = :typeEtat AND docAuto.familleMetier.id IN :familleIds  */ select count(*) as col_0_0_ from rsauto_prod.gdm_doc_auto documentau0_ inner join rsauto_prod.gdm_document documentau0_1_ on documentau0_.id=documentau0_1_.id, rsauto_prod.ETA_ETAT_ENTITE etat1_, rsauto_prod.eta_ref_type_etat reftypeeta2_ where documentau0_1_.etat_id=etat1_.id and etat1_.type_etat_id=reftypeeta2_.id and reftypeeta2_.CODE=? and (documentau0_1_.REF_LOT_GED_METIER_ID in (? , ?)); time=3043 ms;
19-Jun-2018 15:16:33.998 WARNING [ajp-nio-8009-exec-71] org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport.reportSlowQuery Slow Query Report SQL=select * from ( select a.*, ROWNUM rnum from (
    select distinct RES.*

    from ACT_RU_TASK RES


       inner
      join ACT_RU_IDENTITYLINK I on I.TASK_ID_ = RES.ID_







     WHERE  RES.ASSIGNEE_ is null
        and I.TYPE_ = 'candidate'
        and
        (



            I.GROUP_ID_ IN
            (
              ?
            ,
              ?
            ,
              ?
            )

        )


     order by RES.PRIORITY_ desc, RES.CREATE_TIME_ asc
      ) a where ROWNUM < ?) where rnum  >= ?; time=2181 ms;'

** i used this config file :**

input { stdin { } }
output { stdout { codec => rubydebug } }
filter {
#   mutate { gsub => [ "message", "^$", "" ] }
  if [message] =~ /^\s*$/ {
      drop { }
    }
  grok {
    match => { "message" => '%{MONTHDAY}-%{MONTH}-%{YEAR}\s%{TIME}\s%{DATA:level}\s\[%{DATA:thread}\]\s%{DATA:class}\s%{GREEDYDATA:message}' }
  }
}

the result is this :

my quetsion is why i get this _grokparsefailure, in kibana i match all the fileds but i still get the grokparsefailure

thanks in advance.

Badger · June 20, 2018, 3:11pm

Please do not post text using images. Just post the text.

The default codec for a stdin input consumes input one line at a time. For a line like

19-Jun-2018 15:16:28.670 WARNING [ajp-nio-8009-exec-67] org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport.reportSlowQuery Slow Query Report SQL=select * from ( select a.*, ROWNUM rnum from (

That matches the grok pattern successfully. Since you named the trailing GREEDYDATA message, and the event already has a field called message, this results in message becoming an array.

For the next line of the file

         select distinct RES.*

The grok pattern does not match, so you get a _grokparsefailure.

You should look at the multiline codec, so that you can combine every line of the SQL into a single event.

wafaAG · June 20, 2018, 5:27pm

Hi,

Thank you for your help.

i changed the config file :

input { stdin {
codec => multiline {
  pattern => "^%{MONTHDAY}-%{MONTH}-%{YEAR}"
  negate => true
  what => previous
}
 } }
output { stdout { codec => rubydebug } }
filter {
  grok {
match => { "message" => '%{MONTHDAY}-%{MONTH}-%{YEAR}\s%{TIME}\s%{DATA:level}\s\[%{DATA:thread}\]\s%{DATA:class}\s%{GREEDYDATA:dt}' }
  }
}

but this time i get only two events from the 3 events in the log :

[root@elk_dev tmp]# cat payload | /usr/share/logstash/bin/logstash -f config.tst
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path //usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
{
            "dt" => "Slow Query Report SQL=select * from ( select a.*, ROWNUM rnum from (\n    select distinct RES.*\n\n    from ACT_HI_TASKINST RES\n\n\n\n\n\n\n\n\n\n\n     order by RES.PRIORITY_ desc, RES.START_TIME_ asc\n      ) a where ROWNUM < ?) where rnum  >= ?; time=10212 ms;",
    "@timestamp" => 2018-06-20T16:17:01.142Z,
         "level" => "WARNING",
      "@version" => "1",
          "host" => "elk_dev",
        "thread" => "ajp-nio-8009-exec-67",
       "message" => "19-Jun-2018 15:16:28.670 WARNING [ajp-nio-8009-exec-67] org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport.reportSlowQuery Slow Query Report SQL=select * from ( select a.*, ROWNUM rnum from (\n    select distinct RES.*\n\n    from ACT_HI_TASKINST RES\n\n\n\n\n\n\n\n\n\n\n     order by RES.PRIORITY_ desc, RES.START_TIME_ asc\n      ) a where ROWNUM < ?) where rnum  >= ?; time=10212 ms;",
         "class" => "org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport.reportSlowQuery",
          "tags" => [
        [0] "multiline"
    ]
}
{
            "dt" => "Slow Query Report SQL=/* SELECT COUNT(*) FROM DocumentAuto docAuto WHERE docAuto.etat.typeEtat.code = :typeEtat AND docAuto.familleMetier.id IN :familleIds  */ select count(*) as col_0_0_ from rsauto_prod.gdm_doc_auto documentau0_ inner join rsauto_prod.gdm_document documentau0_1_ on documentau0_.id=documentau0_1_.id, rsauto_prod.ETA_ETAT_ENTITE etat1_, rsauto_prod.eta_ref_type_etat reftypeeta2_ where documentau0_1_.etat_id=etat1_.id and etat1_.type_etat_id=reftypeeta2_.id and reftypeeta2_.CODE=? and (documentau0_1_.REF_LOT_GED_METIER_ID in (? , ?)); time=3043 ms;",
    "@timestamp" => 2018-06-20T16:17:01.146Z,
         "level" => "WARNING",
      "@version" => "1",
          "host" => "elk_dev",
        "thread" => "ajp-nio-8009-exec-74",
       "message" => "19-Jun-2018 15:16:29.431 WARNING [ajp-nio-8009-exec-74] org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport.reportSlowQuery Slow Query Report SQL=/* SELECT COUNT(*) FROM DocumentAuto docAuto WHERE docAuto.etat.typeEtat.code = :typeEtat AND docAuto.familleMetier.id IN :familleIds  */ select count(*) as col_0_0_ from rsauto_prod.gdm_doc_auto documentau0_ inner join rsauto_prod.gdm_document documentau0_1_ on documentau0_.id=documentau0_1_.id, rsauto_prod.ETA_ETAT_ENTITE etat1_, rsauto_prod.eta_ref_type_etat reftypeeta2_ where documentau0_1_.etat_id=etat1_.id and etat1_.type_etat_id=reftypeeta2_.id and reftypeeta2_.CODE=? and (documentau0_1_.REF_LOT_GED_METIER_ID in (? , ?)); time=3043 ms;",
         "class" => "org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport.reportSlowQuery"
}
[root@elk_dev tmp]#

Regards.

Badger · June 20, 2018, 5:41pm

With that multiline codec the end of the third event is defined by the start of the fourth event. But there is no fourth event, so the third event never ends. Look at the auto_flush_interval option for the multiline codec.

wafaAG · June 21, 2018, 11:26am

Hi,

Thank you for your response.
now the parsing is correct if i excute this in command line :

{
    "@timestamp" => 2018-06-21T09:59:41.366Z,
         "level" => "WARNING",
      "@version" => "1",
          "host" => "elk_dev",
        "thread" => "ajp-nio-8009-exec-67",
       "message" => "Slow Query Report SQL=select * from ( select a.*, ROWNUM rnum from (\n    select distinct RES.*\n\n    from ACT_HI_TASKINST RES\n\n\n\n\n\n\n\n\n\n\n     order by RES.PRIORITY_ desc, RES.START_TIME_ asc\n      ) a where ROWNUM < ?) where rnum  >= ?; time=10212 ms;",
         "class" => "org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport.reportSlowQuery",
          "tags" => [
        [0] "multiline"
    ]
}
{
    "@timestamp" => 2018-06-21T09:59:41.369Z,
         "level" => "WARNING",
      "@version" => "1",
          "host" => "elk_dev",
        "thread" => "ajp-nio-8009-exec-74",
       "message" => "Slow Query Report SQL=/* SELECT COUNT(*) FROM DocumentAuto docAuto WHERE docAuto.etat.typeEtat.code = :typeEtat AND docAuto.familleMetier.id IN :familleIds  */ select count(*) as col_0_0_ from rsauto_prod.gdm_doc_auto documentau0_ inner join rsauto_prod.gdm_document documentau0_1_ on documentau0_.id=documentau0_1_.id, rsauto_prod.ETA_ETAT_ENTITE etat1_, rsauto_prod.eta_ref_type_etat reftypeeta2_ where documentau0_1_.etat_id=etat1_.id and etat1_.type_etat_id=reftypeeta2_.id and reftypeeta2_.CODE=? and (documentau0_1_.REF_LOT_GED_METIER_ID in (? , ?)); time=3043 ms;",
         "class" => "org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport.reportSlowQuery"
}

if i passe the input by a filebeat with this configuration :

filebeat:
  prospectors:
    - paths:
        - /app/app1/tomcat/logs/catalina*.log
      input_type: log
      fields:
        app_id: app1
        log_type: server-logs
      # Mutate logs
      multiline.pattern: '^[0-9]{2}-[a-zA-Z]{3}-[0-9]{4}'
      multiline.negate: true
      multiline.match: after

i still get this _grokparsefailure; the multline filter configuration are the same as those in input or i am missing some thing ?

Regards.

system · July 19, 2018, 11:26am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Use grok to filter mysql slow-queries Logstash	2	1159	March 20, 2019
Too many events Logstash	7	329	April 3, 2019
Mysql slow log Logstash	2	174	October 24, 2023
Grok filter cant split error message and java stacktrace into fields Logstash	2	1965	July 10, 2017
[Logstash] Grok Filter does not work _grokparsefailure Logstash	16	599	March 22, 2022

Grock filter to parse SQL query

Related Topics