Converting a multiline mysql query into a single line query

Good day,

I am looking for a solution to trim multiple white spaces, removing new lines, and a special character "^M" (in my case) from a multiline mysql query (Not multiline mysql-slow.log) in order to get the mysql query into a single line. The scenario is as follows:

  • I have a web application using mysql
  • I have enabled mysql-slow.log
  • I have installed filebeat and enabled the mysql module
  • I have loaded the corresponding filebeat mysql ingest pipeline to Elasticsearch (Not using Logstash)
  • The multiline logs from mysql-slow.log are parsed properly

The problem:

  • The parsed query is stored in the structured field "mysql.slowlog.query" is stored as a multiline query. Sometimes is stored as a single line but is not always the case.

I need this mysql query to be parsed and stored into the field "mysql.slowlog.query" as a single line query. As an example, from the multiline log below, I want all the parameters in the multiline log parsed into their corresponding field as they are with the exception of "mysql.slowlog.query"

# Time: 2020-08-05T17:14:07.330582Z
# User@Host: root[root] @ localhost [127.0.0.1]  Id:    45
# Query_time: 0.000290  Lock_time: 0.000138 Rows_sent: 1  Rows_examined: 23
SET timestamp=1596647647;
SELECT * FROM accounts ^M
                        WHERE username='jeremy' AND password='password';

I want the field "mysql.slowlog.query" to provide the following:

SELECT * FROM accounts WHERE username='jeremy' AND password='password';

From what I can observe, the loaded ingest pipeline " filebeat-7.6.2-mysql-slowlog-pipeline" extracts the query into the "mysql.slowlog.query" field by using the following grok pattern:

%{GREEDYMULTILINE:mysql.slowlog.query}

Where the current pattern definitions are set:

"GREEDYMULTILINE": "(.|\n)*",
"METRICSPACE": "([ #\n]*)",
"EXPLAIN": "(# explain:.*\n|#\\s*\n)*"
  • What would be the solution to trim multiple spaces to a single one, new lines, and in my case the "^M" located at the end of the first line?
  • Can the proposed solution also work for the following example:
SELECT *
FROM   `users`
WHERE  `user_id` IN (SELECT f2.`friend_user_id`
                     FROM   `friends` AS f1
                            INNER JOIN `friends` AS f2
                              ON f1.`friend_user_id` = f2.`user_id`
                     WHERE  f2.`is_page` = 0
                            AND f1.`user_id` = "%1$d"
                            AND f2.`friend_user_id` != "%1$d"
                            AND f2.`friend_user_id` NOT IN (SELECT `friend_user_id`
                                                            FROM   `friends`
                                                            WHERE  `user_id` = "%1$d"))
       AND `user_id` NOT IN (SELECT `user_id`
                             FROM   `friend_requests`
                             WHERE  `friend_user_id` = "%1$d")
       AND `user_image` IS NOT NULL
ORDER  BY RAND() 

I greatly appreciate your help.

I have found a quick solution for the first case by adding the following into the gsub processor into the pipeline:

  {
    "gsub": {
      "field": "mysql.slowlog.query",
      "pattern": "\\^M\\n\\s*",
      "replacement": ""
    }
  }

The only problem is that it doesn't apply to the last case. I tried adding the following two gsub processors but it messed up my results for my 1st case and ended in an error for the second case. It seems that the order in which you state the gsub processors matters:

  {
    "gsub": {
      "field": "mysql.slowlog.query",
      "pattern": "\\s*",
      "replacement": " "
    }
  },
  {
    "gsub": {
      "field": "mysql.slowlog.query",
      "pattern": "\\n",
      "replacement": " "
    }
  }

Results for case 1:

S E L E C T  *  F R O M  a c c o u n t s  W H E R E  u s e r n a m e = ' j e r e m y '  A N D  p a s s w o r d = ' p a s s w o r d ' ; 

Or the following depending on the order of the gsub processors:

S E L E C T  *  F R O M  a c c o u n t s  ^ M  W H E R E  u s e r n a m e = ' j e r e m y '  A N D  p a s s w o r d = ' p a s s w o r d ' ; 

For the second case, the results where the following:

"message": "# User@Host: root[root] @ localhost [127.0.0.1]  Id:    45\n# Query_time: 0.000290  Lock_time: 0.000138 Rows_sent: 1  Rows_examined: 23\nSELECT *\nFROM   `users`\nWHERE  `user_id` IN (SELECT f2.`friend_user_id`\n                     FROM   `friends` AS f1\n                            INNER JOIN `friends` AS f2\n                              ON f1.`friend_user_id` = f2.`user_id`\n                     WHERE  f2.`is_page` = 0\n                            AND f1.`user_id` = \"%1$d\"\n                            AND f2.`friend_user_id` != \"%1$d\"\n                            AND f2.`friend_user_id` NOT IN (SELECT `friend_user_id`\n                                                            FROM   `friends`\n                                                            WHERE  `user_id` = \"%1$d\"))\n       AND `user_id` NOT IN (SELECT `user_id`\n                             FROM   `friend_requests`\n                             WHERE  `friend_user_id` = \"%1$d\")\n       AND `user_image` IS NOT NULL\nORDER  BY RAND()",
    "fileset": {
      "name": "slowlog"
    },
    "error": {
      "message": "Provided Grok expressions do not match field value: [# User@Host: root[root] @ localhost [127.0.0.1]  Id:    45\\n# Query_time: 0.000290  Lock_time: 0.000138 Rows_sent: 1  Rows_examined: 23\\nSELECT *\\nFROM   `users`\\nWHERE  `user_id` IN (SELECT f2.`friend_user_id`\\n                     FROM   `friends` AS f1\\n                            INNER JOIN `friends` AS f2\\n                              ON f1.`friend_user_id` = f2.`user_id`\\n                     WHERE  f2.`is_page` = 0\\n                            AND f1.`user_id` = \\\"%1$d\\\"\\n                            AND f2.`friend_user_id` != \\\"%1$d\\\"\\n                            AND f2.`friend_user_id` NOT IN (SELECT `friend_user_id`\\n                                                            FROM   `friends`\\n                                                            WHERE  `user_id` = \\\"%1$d\\\"))\\n       AND `user_id` NOT IN (SELECT `user_id`\\n                             FROM   `friend_requests`\\n                             WHERE  `friend_user_id` = \\\"%1$d\\\")\\n       AND `user_image` IS NOT NULL\\nORDER  BY RAND()]"
    }

Any suggestions?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.