JDBC log parsing

Hi All,

I've gathered more information and now understand what this JDBC log is outputting. For a given log line:

79|master|180615-09:36:22:287|1 ms|statement|INSERT INTO tasks_aux_queue (PK, RANGE_VALUE, NODE_ID, NODE_GROUP, EXECUTION_TIME) SELECT * FROM (SELECT PK, floor(0 + (rand()(1000-0))) AS rangeCol, CASE WHEN p_nodeId IS NULL THEN -1 ELSE p_nodeId END AS nodeIdCol, CASE WHEN p_nodeGroup IS NULL THEN '---' ELSE p_nodeGroup END AS nodeGroupCol, p_expirationtimemillis/1000/60 AS execTimeCol FROM tasks WHERE p_failed = 0 AND p_expirationtimemillis <= ? AND p_runningonclusternode = -1) AS A WHERE NOT EXISTS (SELECT 1 FROM tasks_aux_queue AS B WHERE A.PK = B.PK)|INSERT INTO tasks_aux_queue (PK, RANGE_VALUE, NODE_ID, NODE_GROUP, EXECUTION_TIME) SELECT * FROM (SELECT PK, floor(0 + (rand()(1000-0))) AS rangeCol, CASE WHEN p_nodeId IS NULL THEN -1 ELSE p_nodeId END AS nodeIdCol, CASE WHEN p_nodeGroup IS NULL THEN '---' ELSE p_nodeGroup END AS nodeGroupCol, p_expirationtimemillis/1000/60 AS execTimeCol FROM tasks WHERE p_failed = 0 AND p_expirationtimemillis <= '1529080582279' AND p_runningonclusternode = -1) AS A WHERE NOT EXISTS (SELECT 1 FROM tasks_aux_queue AS B WHERE A.PK = B.PK)

Here are the fields:

thread id|datasource id|start time|elapsed time|category|prepared statement|sql statement
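
Given that strictly pipe-delimited layout, I've also been wondering whether a dissect filter would be a simpler fit than grok. This is only an untested sketch on my part, using the same field names as the grok attempt below, and it assumes the SQL text itself never contains a pipe character (dissect cuts at the first delimiter it finds, unlike GREEDYDATA, which cuts at the last):

dissect {
  mapping => {
    "message" => "%{thread_id}|%{datasource_id}|%{timestamp}|%{duration} %{interval}|%{category}|%{prepared_statement}|%{sql_statement}"
  }
}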

The Logstash (LS) parsing I'm attempting (and bear with me, I'm a novice at this) is:

grok {
  match => { "message" => "%{NUMBER:thread_id}\|%{WORD:datasource_id}\|%{DATA:timestamp}\|%{NUMBER:duration} %{WORD:interval}\|%{WORD:category}\|%{GREEDYDATA:prepared_statement}\|%{GREEDYDATA:sql_statement}" }
}

date { match => ["timestamp", "yyMMdd-HH:mm:ss:SSS"] }
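
One thing I'm unsure about with the date filter is that these timestamps carry no timezone, so I suspect I may also need to tell it which zone the log is written in, along the lines of the following ("UTC" here is only a placeholder for whatever the server is actually set to):

date {
  match => ["timestamp", "yyMMdd-HH:mm:ss:SSS"]
  timezone => "UTC"
}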

If I run the above log line in the Kibana Grok Debugger, it appears to work fine:

{
  "duration": "1",
  "thread_id": "79",
  "datasource_id": "master",
  "prepared_statement": "INSERT INTO tasks_aux_queue (PK, RANGE_VALUE, NODE_ID, NODE_GROUP, EXECUTION_TIME) SELECT * FROM (SELECT PK, floor(0 + (rand()(1000-0))) AS rangeCol, CASE WHEN p_nodeId IS NULL THEN -1 ELSE p_nodeId END AS nodeIdCol, CASE WHEN p_nodeGroup IS NULL THEN '---' ELSE p_nodeGroup END AS nodeGroupCol, p_expirationtimemillis/1000/60 AS execTimeCol FROM tasks WHERE p_failed = 0 AND p_expirationtimemillis <= ? AND p_runningonclusternode = -1) AS A WHERE NOT EXISTS (SELECT 1 FROM tasks_aux_queue AS B WHERE A.PK = B.PK)",
  "sql_statement": "INSERT INTO tasks_aux_queue (PK, RANGE_VALUE, NODE_ID, NODE_GROUP, EXECUTION_TIME) SELECT * FROM (SELECT PK, floor(0 + (rand()(1000-0))) AS rangeCol, CASE WHEN p_nodeId IS NULL THEN -1 ELSE p_nodeId END AS nodeIdCol, CASE WHEN p_nodeGroup IS NULL THEN '---' ELSE p_nodeGroup END AS nodeGroupCol, p_expirationtimemillis/1000/60 AS execTimeCol FROM tasks WHERE p_failed = 0 AND p_expirationtimemillis <= '1529080582279' AND p_runningonclusternode = -1) AS A WHERE NOT EXISTS (SELECT 1 FROM tasks_aux_queue AS B WHERE A.PK = B.PK)",
  "interval": "ms",
  "category": "statement",
  "timestamp": "180615-09:36:22:287"
}

It also runs fine when I send the output to stdout with output { stdout { codec => rubydebug } }.
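
In case it helps anyone reproduce this, a minimal pipeline for that stdout test would be something like the following (the stdin input is only a stand-in so the sample line can be pasted in; my real input section is different and omitted here):

input {
  stdin { }
}

filter {
  grok {
    match => { "message" => "%{NUMBER:thread_id}\|%{WORD:datasource_id}\|%{DATA:timestamp}\|%{NUMBER:duration} %{WORD:interval}\|%{WORD:category}\|%{GREEDYDATA:prepared_statement}\|%{GREEDYDATA:sql_statement}" }
  }
  date { match => ["timestamp", "yyMMdd-HH:mm:ss:SSS"] }
}

output {
  stdout { codec => rubydebug }
}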

However, when I run it through LS it does not work properly; the events get tagged with _grokparsefailure.

I am pretty sure the problem is with the GREEDYDATA patterns, but I'm not knowledgeable enough to know how to fix it. The log file fields appear to be universally separated by pipes "|" with no spaces. Can anyone help me figure out what might be going on here?
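
One thing I'm considering trying, since every field is delimited by a pipe, is replacing the DATA/GREEDYDATA patterns with explicit "anything but a pipe" captures. This is just an untested sketch of the idea, and again it assumes the SQL never contains a literal pipe:

grok {
  match => { "message" => "(?<thread_id>[^|]+)\|(?<datasource_id>[^|]+)\|(?<timestamp>[^|]+)\|(?<duration>[^ ]+) (?<interval>[^|]+)\|(?<category>[^|]+)\|(?<prepared_statement>[^|]+)\|(?<sql_statement>.*)" }
}

Does that look like a reasonable direction, or is the problem likely elsewhere?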

TIA,

HB