Jdbc log parsing

Hi All,

I would like to parse a jdbc log and I'm just learning the process. The log looks like this:

40703|master|180615-09:36:22:315|7 ms|statement|UPDATE tasks SET p_runningonclusternode = ? WHERE p_runningonclusternode = ? AND PK = ? AND hjmpTS = ? |UPDATE tasks SET p_runningonclusternode = 0 WHERE p_runningonclusternode = -1 AND PK = 8796125856694 AND hjmpTS = 1076364
40703|master|180615-09:36:22:322|5 ms|statement|UPDATE taskconditions SET p_consumed = ? WHERE p_task = ?|UPDATE taskconditions SET p_consumed = 1 WHERE p_task = 8796125856694
40703|master|180615-09:36:22:328|1 ms|statement|SELECT item_t0.PK FROM tasks item_t0 WHERE ( item_t0.p_trigger = 8796125856246) AND (item_t0.TypePkString=? )|SELECT item_t0.PK FROM tasks item_t0 WHERE ( item_t0.p_trigger = 8796125856246) AND (item_t0.TypePkString=8796107178066 )
40703|master|180615-09:36:22:333|2 ms|statement|UPDATE triggerscj SET hjmpTS = ? ,modifiedTS=?,p_activationtime=? WHERE PK = ? AND (sealed IS NULL OR sealed=0)|UPDATE triggerscj SET hjmpTS = 538183 ,modifiedTS='2018-06-15 09:36:22.329',p_activationtime='2018-06-15 09:36:23.0' WHERE PK = 8796125856246 AND (sealed IS NULL OR sealed=0)
40703|master|180615-09:36:22:337|2 ms|statement|UPDATE tasks SET hjmpTS = ? ,modifiedTS=?,p_executiontimemillis=? WHERE PK = ? AND (sealed IS NULL OR sealed=0)|UPDATE tasks SET hjmpTS = 1076365 ,modifiedTS='2018-06-15 09:36:22.334',p_executiontimemillis=1529080583000 WHERE PK = 8796125856694 AND (sealed IS NULL OR sealed=0)
40703|master|180615-09:36:22:341|3 ms|commit||
40703|master|180615-09:36:22:344|1 ms|statement|SELECT * FROM triggerscj WHERE PK=?|SELECT * FROM triggerscj WHERE PK=8796125856246
40703|master|180615-09:36:22:347|1 ms|statement|SELECT * FROM tasks WHERE PK=?|SELECT * FROM tasks WHERE PK=8796125856694
40703|master|180615-09:36:22:354|5 ms|statement|UPDATE tasks SET hjmpTS = ? ,modifiedTS=?,p_retry=? WHERE PK = ? AND (sealed IS NULL OR sealed=0)|UPDATE tasks SET hjmpTS = 1076366 ,modifiedTS='2018-06-15 09:36:22.348',p_retry=538183 WHERE PK = 8796125856694 AND (sealed IS NULL OR sealed=0)
40703|master|180615-09:36:22:360|4 ms|statement|UPDATE tasks SET p_runningonclusternode = ? WHERE PK = ?|UPDATE tasks SET p_runningonclusternode = -1 WHERE PK = 8796125856694
40703|master|180615-09:36:22:364|2 ms|statement|DELETE FROM tasks_aux_queue WHERE PK IN (?)|DELETE FROM tasks_aux_queue WHERE PK IN ('8796125856694')
40703|master|180615-09:36:22:371|6 ms|commit||
61|master|180615-09:36:22:386|0 ms|statement|SELECT isInitialized FROM metainformations WHERE PK = ?|SELECT isInitialized FROM metainformations WHERE PK = 15481123719086096
61|master|180615-09:36:22:389|1 ms|statement|SELECT * FROM tasks WHERE PK=?|SELECT * FROM tasks WHERE PK=8796125856694
38|master|180615-09:36:22:791|4 ms|statement|DELETE FROM tasks_aux_workers WHERE ID = ? AND IS_ACTIVE = 0|DELETE FROM tasks_aux_workers WHERE ID = 0 AND IS_ACTIVE = 0

Looking at the fields it looks like they are:

PID | Node? | timestamp | duration | command type? | sql statement | additional sql statement?

Does anyone know how I can confirm what each field corresponds to? The first field looks like it could be a PID, but values less than 100 seem odd for that. Thoughts?

Any guidance on how I can get started on creating the logstash input block?
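For context, the rough skeleton I'm imagining for the input side (the path is just a placeholder, since I don't know where the file would actually live) is:

```
input {
  file {
    path => "/path/to/jdbc.log"        # placeholder -- actual log location unknown
    start_position => "beginning"       # read the existing file from the top on first run
  }
}
```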

Thanks in advance..

HB

Hi All,

I've gathered more information and am able to understand what this JDBC log is outputting. For a given log line:

79|master|180615-09:36:22:287|1 ms|statement|INSERT INTO tasks_aux_queue (PK, RANGE_VALUE, NODE_ID, NODE_GROUP, EXECUTION_TIME) SELECT * FROM (SELECT PK, floor(0 + (rand()(1000-0))) AS rangeCol, CASE WHEN p_nodeId IS NULL THEN -1 ELSE p_nodeId END AS nodeIdCol, CASE WHEN p_nodeGroup IS NULL THEN '---' ELSE p_nodeGroup END AS nodeGroupCol, p_expirationtimemillis/1000/60 AS execTimeCol FROM tasks WHERE p_failed = 0 AND p_expirationtimemillis <= ? AND p_runningonclusternode = -1) AS A WHERE NOT EXISTS (SELECT 1 FROM tasks_aux_queue AS B WHERE A.PK = B.PK)|INSERT INTO tasks_aux_queue (PK, RANGE_VALUE, NODE_ID, NODE_GROUP, EXECUTION_TIME) SELECT * FROM (SELECT PK, floor(0 + (rand()(1000-0))) AS rangeCol, CASE WHEN p_nodeId IS NULL THEN -1 ELSE p_nodeId END AS nodeIdCol, CASE WHEN p_nodeGroup IS NULL THEN '---' ELSE p_nodeGroup END AS nodeGroupCol, p_expirationtimemillis/1000/60 AS execTimeCol FROM tasks WHERE p_failed = 0 AND p_expirationtimemillis <= '1529080582279' AND p_runningonclusternode = -1) AS A WHERE NOT EXISTS (SELECT 1 FROM tasks_aux_queue AS B WHERE A.PK = B.PK)

Here are the fields:

thread id|datasource id|start time|elapsed time|category|prepared statement|sql statement

The Logstash parsing I'm attempting (and bear with me, I'm a novice at this) is:

grok {
  match => { "message" => "%{NUMBER:thread_id}\|%{WORD:datasource_id}\|%{DATA:timestamp}\|%{NUMBER:duration} %{WORD:interval}\|%{WORD:category}\|%{GREEDYDATA:prepared_statement}\|%{GREEDYDATA:sql_statement}" }
}

date { match => ["timestamp", "yyMMdd-HH:mm:ss:SSS"] }

If I run the above log line in the Kibana grok debugger it appears to work fine:

{
"duration": "1",
"thread_id": "79",
"datasource_id": "master",
"prepared_statement": "INSERT INTO tasks_aux_queue (PK, RANGE_VALUE, NODE_ID, NODE_GROUP, EXECUTION_TIME) SELECT * FROM (SELECT PK, floor(0 + (rand()(1000-0))) AS rangeCol, CASE WHEN p_nodeId IS NULL THEN -1 ELSE p_nodeId END AS nodeIdCol, CASE WHEN p_nodeGroup IS NULL THEN '---' ELSE p_nodeGroup END AS nodeGroupCol, p_expirationtimemillis/1000/60 AS execTimeCol FROM tasks WHERE p_failed = 0 AND p_expirationtimemillis <= ? AND p_runningonclusternode = -1) AS A WHERE NOT EXISTS (SELECT 1 FROM tasks_aux_queue AS B WHERE A.PK = B.PK)",
"sql_statement": "INSERT INTO tasks_aux_queue (PK, RANGE_VALUE, NODE_ID, NODE_GROUP, EXECUTION_TIME) SELECT * FROM (SELECT PK, floor(0 + (rand()
(1000-0))) AS rangeCol, CASE WHEN p_nodeId IS NULL THEN -1 ELSE p_nodeId END AS nodeIdCol, CASE WHEN p_nodeGroup IS NULL THEN '---' ELSE p_nodeGroup END AS nodeGroupCol, p_expirationtimemillis/1000/60 AS execTimeCol FROM tasks WHERE p_failed = 0 AND p_expirationtimemillis <= '1529080582279' AND p_runningonclusternode = -1) AS A WHERE NOT EXISTS (SELECT 1 FROM tasks_aux_queue AS B WHERE A.PK = B.PK)",
"interval": "ms",
"category": "statement",
"timestamp": "180615-09:36:22:287"
}

It also runs fine when I send the output to stdout with: output { stdout { codec => rubydebug } }.

However, when I run it through Logstash it does not work properly: the events come out tagged with _grokparsefailure.

I am pretty sure the problem is with the GREEDYDATA syntax, but I'm not knowledgeable enough to fix it. The log fields appear to be universally separated by pipes ("|") with no spaces. Can anyone assist with what might be going on here?
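One alternative I'm wondering about, on the assumption that the greedy patterns are the culprit, is sidestepping grok entirely and using the dissect filter, which splits on literal delimiters instead of regex matching:

```
filter {
  # Assumes the SQL text itself never contains a pipe character;
  # dissect splits the message on the literal "|" delimiters in order,
  # with the final field capturing the remainder of the line.
  dissect {
    mapping => {
      "message" => "%{thread_id}|%{datasource_id}|%{timestamp}|%{duration} %{interval}|%{category}|%{prepared_statement}|%{sql_statement}"
    }
  }
}
```

Another thought: keeping grok but replacing the first GREEDYDATA with a non-pipe character class, e.g. (?<prepared_statement>[^|]*), so the match can't backtrack across the last delimiter.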

TIA,

HB

Is there anyone in the community that can point me in the right direction here?

Maybe just how to troubleshoot?
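The simplest isolation test I've come up with myself (file names are my own invention) is to feed a single sample line in via stdin and inspect the raw event, so the file input and everything else is ruled out:

```
# test.conf -- minimal pipeline for checking the grok pattern in isolation
input { stdin {} }
filter {
  grok {
    match => { "message" => "%{NUMBER:thread_id}\|%{WORD:datasource_id}\|%{DATA:timestamp}\|%{NUMBER:duration} %{WORD:interval}\|%{WORD:category}\|%{GREEDYDATA:prepared_statement}\|%{GREEDYDATA:sql_statement}" }
  }
}
output { stdout { codec => rubydebug } }
```

Then run something like bin/logstash -f test.conf < sample.log and check whether the printed events carry the _grokparsefailure tag.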

Thanks,

HB

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.