MySQL slow query log parsing error

Hi,
I am using Filebeat with the mysql module to send MySQL error logs and slow logs to Elasticsearch.

Here's an extract from my configuration:

 #==========================  Modules configuration ============================
 filebeat.modules:
 - module: mysql
   # Error logs
   error:
     enabled: true
   # Slow logs
   slowlog:
     enabled: true
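For what it's worth, the module picks up the distribution's default log paths; if the slow log lives somewhere non-standard, the fileset also accepts an explicit var.paths override. The path below is only an example, adjust it to your setup:

 filebeat.modules:
 - module: mysql
   slowlog:
     enabled: true
     # Example override; point this at your actual slow log location
     var.paths: ["/var/log/mysql/mysql-slow.log*"]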

The error logs are parsed without any problem, but the slow logs fail, and the following message shows up in my Elasticsearch documents:

Provided Grok expressions do not match field value: [# User@Host: myhost[myhost] @ host [ip] Id: 10540316\n# Query_time: 4.065385 Lock_time: 0.000220 Rows_sent: 1000 Rows_examined: 1481669\nSET timestamp=1493217907;\nSELECT mcu.mcu_guid, mcu.cus_guid, mcu.mcu_url, mcu.mcu_crawlelements, mcu.mcu_order, GROUP_CONCAT(mca.mca_guid SEPARATOR \";\") as mca_guid\n FROM kat_mailcustomerurl mcu, kat_customer cus, kat_mailcampaign mca\n WHERE cus.cus_guid = mcu.cus_guid\n \tAND cus.pro_code = 'CYB'\n \tAND cus.cus_offline = 0\n \tAND mca.cus_guid = cus.cus_guid\n \tAND (mcu.mcu_date IS NULL OR mcu.mcu_date < CURDATE())\n \tAND mcu.mcu_crawlelements IS NOT NULL\n GROUP BY mcu.mcu_guid\n ORDER BY mcu.mcu_order ASC\n LIMIT 1000;]

When I copy/paste my logs into http://grokconstructor.appspot.com using the grok pattern found in filebeat/module/mysql/slowlog/ingest/pipeline.json, they match. But for a reason I don't understand, Filebeat fails. :sweat:
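One detail that can make online testers misleading: GREEDYMULTILINE is not a stock grok pattern. The pipeline defines it itself under pattern_definitions, roughly like this, so a tester only behaves the same way if you feed it the same definition:

 "pattern_definitions": {
   "GREEDYMULTILINE": "(.|\n)*"
 }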

Thanks for your help

It seems the MySQL log on your side does not match the one we use for testing.

  • Which MySQL version are you using?
  • Could you provide some slow log examples?

I am using MySQL 5.7.17 on Debian.

Here's an extract:

# Time: 2017-04-28T09:07:39.777791Z
# User@Host: apphost[apphost] @ apphost [ip]  Id: 10997316
# Query_time: 4.071491  Lock_time: 0.000212 Rows_sent: 1000  Rows_examined: 1489615
SET timestamp=1493370459;
SELECT mcu.mcu_guid, mcu.cus_guid, mcu.mcu_url, mcu.mcu_crawlelements, mcu.mcu_order, GROUP_CONCAT(mca.mca_guid SEPARATOR ";") as mca_guid
                    FROM kat_mailcustomerurl mcu, kat_customer cus, kat_mailcampaign mca
                    WHERE cus.cus_guid = mcu.cus_guid
                        AND cus.pro_code = 'CYB'
                        AND cus.cus_offline = 0
                        AND mca.cus_guid = cus.cus_guid
                        AND (mcu.mcu_date IS NULL OR mcu.mcu_date < CURDATE())
                        AND mcu.mcu_crawlelements IS NOT NULL
                    GROUP BY mcu.mcu_guid
                    ORDER BY mcu.mcu_order ASC
                    LIMIT 1000;
# Time: 2017-04-28T09:16:30.738365Z
# User@Host: apphost[apphost] @ apphost [ip]  Id: 10999834
# Query_time: 10.346539  Lock_time: 0.000036 Rows_sent: 0  Rows_examined: 4751313
SET timestamp=1493370990;
call load_stats(1, '2017-04-28 00:00:00');
# Time: 2017-04-28T09:31:31.133657Z
# User@Host: apphost[apphost] @ apphost [ip]  Id: 11004208
# Query_time: 10.508030  Lock_time: 0.000034 Rows_sent: 0  Rows_examined: 4754675
SET timestamp=1493371891;
call load_stats(1, '2017-04-28 00:00:00');
# Time: 2017-04-28T09:32:41.128590Z
# User@Host: apphost[apphost] @ apphost [ip]  Id: 11004559
# Query_time: 4.231050  Lock_time: 0.000211 Rows_sent: 1000  Rows_examined: 1489498
SET timestamp=1493371961;
SELECT mcu.mcu_guid, mcu.cus_guid, mcu.mcu_url, mcu.mcu_crawlelements, mcu.mcu_order, GROUP_CONCAT(mca.mca_guid SEPARATOR ";") as mca_guid
                    FROM kat_mailcustomerurl mcu, kat_customer cus, kat_mailcampaign mca
                    WHERE cus.cus_guid = mcu.cus_guid
                        AND cus.pro_code = 'CYB'
                        AND cus.cus_offline = 0
                        AND mca.cus_guid = cus.cus_guid
                        AND (mcu.mcu_date IS NULL OR mcu.mcu_date < CURDATE())
                        AND mcu.mcu_crawlelements IS NOT NULL
                    GROUP BY mcu.mcu_guid
                    ORDER BY mcu.mcu_order ASC
                    LIMIT 1000;
# Time: 2017-04-28T09:38:58.136628Z
# User@Host: apphost[apphost] @ apphost [ip]  Id: 11006302
# Query_time: 4.057573  Lock_time: 0.000219 Rows_sent: 1000  Rows_examined: 1489519
SET timestamp=1493372338;
SELECT mcu.mcu_guid, mcu.cus_guid, mcu.mcu_url, mcu.mcu_crawlelements, mcu.mcu_order, GROUP_CONCAT(mca.mca_guid SEPARATOR ";") as mca_guid
                    FROM kat_mailcustomerurl mcu, kat_customer cus, kat_mailcampaign mca
                    WHERE cus.cus_guid = mcu.cus_guid
                        AND cus.pro_code = 'CYB'
                        AND cus.cus_offline = 0
                        AND mca.cus_guid = cus.cus_guid
                        AND (mcu.mcu_date IS NULL OR mcu.mcu_date < CURDATE())
                        AND mcu.mcu_crawlelements IS NOT NULL
                    GROUP BY mcu.mcu_guid
                    ORDER BY mcu.mcu_order ASC
                    LIMIT 1000;
# Time: 2017-04-28T09:46:31.124448Z
# User@Host: apphost[apphost] @ apphost [ip]  Id: 11008346
# Query_time: 10.721687  Lock_time: 0.000054 Rows_sent: 0  Rows_examined: 4757952
SET timestamp=1493372791;
call load_stats(1, '2017-04-28 00:00:00');

I just changed the hostname and the server IP.
But as I said, the grok pattern seems to work when I try it in an online tester. That's what's strange :-/

OK, so now it seems to read the slow log queries. I didn't do anything; maybe it's okay with the new entries added to the file... :confused:

Edit: my bad, only some queries are read (and not the new ones), so the problem is still there.

It seems our grok pattern has issues with the following content:

# User@Host: apphost[apphost] @ apphost [ip]  Id: 10997316
# Query_time: 4.071491  Lock_time: 0.000212 Rows_sent: 1000  Rows_examined: 1489615
SET timestamp=1493370459;

Our grok pattern looks as follows:

^# User@Host: %{USER:mysql.slowlog.user}(\\[[^\\]]+\\])? @ %{HOSTNAME:mysql.slowlog.host} \\[(%{IP:mysql.slowlog.ip})?\\](\\s*Id:\\s* %{NUMBER:mysql.slowlog.id})?\n# Query_time: %{NUMBER:mysql.slowlog.query_time.sec}\\s* Lock_time: %{NUMBER:mysql.slowlog.lock_time.sec}\\s* Rows_sent: %{NUMBER:mysql.slowlog.rows_sent}\\s* Rows_examined: %{NUMBER:mysql.slowlog.rows_examined}\n(SET timestamp=%{NUMBER:mysql.slowlog.timestamp};\n)?%{GREEDYMULTILINE:mysql.slowlog.query}

I need to check in detail where it doesn't match. @tudor perhaps knows?
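One way to narrow this down is to replay one of the failing events through the ingest simulate API and shorten the pattern until it matches. A rough sketch (localhost:9200 and the IP 192.0.2.10 are placeholders, and the pattern is trimmed to the first suspect lines):

 # Replay a redacted slow log event against a trimmed-down grok pattern
 curl -XPOST 'localhost:9200/_ingest/pipeline/_simulate?pretty' \
   -H 'Content-Type: application/json' -d '
 {
   "pipeline": {
     "processors": [{
       "grok": {
         "field": "message",
         "patterns": ["^# User@Host: %{USER:mysql.slowlog.user}(\\[[^\\]]+\\])? @ %{HOSTNAME:mysql.slowlog.host} \\[(%{IP:mysql.slowlog.ip})?\\](\\s*Id:\\s* %{NUMBER:mysql.slowlog.id})?\n%{GREEDYMULTILINE:mysql.slowlog.query}"],
         "pattern_definitions": { "GREEDYMULTILINE": "(.|\n)*" }
       }
     }]
   },
   "docs": [{
     "_source": {
       "message": "# User@Host: apphost[apphost] @ apphost [192.0.2.10]  Id: 10997316\n# Query_time: 4.071491  Lock_time: 0.000212 Rows_sent: 1000  Rows_examined: 1489615\nSET timestamp=1493370459;\nselect 1;"
     }
   }]
 }'

If the pattern doesn't match, the response contains the same "Provided Grok expressions do not match field value" error as the indexed documents; if it matches, it returns the extracted mysql.slowlog.* fields, which makes it easy to bisect the pattern.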

Okay, I hope you'll find it :slight_smile:

Remember that I wrote "apphost" and "ip" myself; in the real logs it's more like "app_hostname" and "ns123456789.ip-1-11-11.eu".
In case that's the problem...

I assume it is an IPv4 address?

Yes, excuse me, I forgot the IP.

Which Filebeat version are you using? Which Elasticsearch version? Which Elasticsearch plugins do you have installed?

I created a PR here to test the file. So far I could not reproduce it, but let's see what CI thinks: https://github.com/elastic/beats/pull/4181

Update: I managed to reproduce it.

Thank you @ruflin and @gnujeremie for investigating this. I opened https://github.com/elastic/beats/pull/4183 with a fix.

You can manually make the same modification in your module/mysql/slowlog/ingest/pipeline.json file if you want to hot-patch this until we release the fix.
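One caveat when hot-patching (assuming the old pipeline has already been loaded): Filebeat registers the pipeline in Elasticsearch under a versioned ID and does not replace an existing one, so after editing pipeline.json you likely need to delete the stored pipeline and restart Filebeat. Something like this; the exact ID may differ on your install:

 # List installed ingest pipelines to find the mysql slowlog one
 curl -XGET 'localhost:9200/_ingest/pipeline?pretty'
 # Delete it (ID guessed from the 5.3.1 naming scheme), then restart Filebeat
 curl -XDELETE 'localhost:9200/_ingest/pipeline/filebeat-5.3.1-mysql-slowlog-pipeline'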

Sorry for not answering sooner, I was on holiday last week :slight_smile:

I'm using Elasticsearch 5.3.1 and Filebeat 5.3.1. The only Elasticsearch plugin is X-Pack.

But it seems like you managed to find the issue, cool!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.