Logstash grok pattern question when sql statement has commas in it

I am trying to get a grok pattern to match the following and i have it besides the fact that i had to leave a field out to get it done. this is postgresql csv pattern and i needed to splice out the duration from the query. you can see that i did this by having a duration field and then the statement field ending with a double comma ,, anyone know how to not have to do a ,, and to be able to get the duration and query statement split out? between the ,, should be a field named "detail" which would be empty in this case. right now i just don't have a detail field but would like to have it.

2020-04-07 22:32:04.673 UTC,"app_keystore","keystore",14220,"[local]",5e8cf4b2.378c,1748710,"SELECT",2020-04-07 21:46:26 UTC,5/0,0,LOG,00000,"duration: 0.148 ms  statement: SELECT ""keystore_app_keyconfig"".""name"", ""keystore_app_keyconfig"".""usage"", ""keystore_app_keyconfig"".""lifetime"", ""keystore_app_keyconfig"".""band_id"", ""keystore_app_keyconfig"".""rotation_method"", ""keystore_app_keyconfig"".""seg_size"", ""keystore_app_keyconfig"".""rotation_time"", ""keystore_app_keyconfig"".""pre_create_keys"", ""keystore_app_keyconfig"".""notes"" FROM ""keystore_app_keyconfig"" WHERE ""keystore_app_keyconfig"".""name"" = 'qa_unit_tests'",,,,,,,,,"app - 10.124.193.84:33116"


(%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{TZ:time_zone})?,(%{DATA:user_name})?,(%{DATA:database_name})?,(%{NUMBER:process_id})?,("\[%{DATA:connection_from}\]")?,(%{USERNAME:session_id})?,(%{NUMBER:session_line_num})?,("%{DATA:command_tag}")?,(%{TIMESTAMP_ISO8601:timestamp2}%{SPACE}%{TZ:time_zone2})?,(%{DATA:virtual_transaction_id})?,(%{DATA:transaction_id})?,(%{DATA:error_severity})?,(%{NUMBER:sql_state_code})?,"duration:%{SPACE}%{NUMBER:duration}%{SPACE}ms%{SPACE}statement:%{SPACE}%{DATA:statement}",,(%{DATA:hint})?,(%{DATA:internal_query})?,(%{DATA:internal_query_pos})?,(%{DATA:context})?,(%{DATA:query})?,(%{DATA:query_pos})?,(%{DATA:location})?,(%{GREEDYDATA:application_name})?

the output from grok debugger is:

{
  "process_id": "14220",
  "time_zone2": "UTC",
  "error_severity": "LOG",
  "sql_state_code": "00000",
  "user_name": "\"app_keystore\"",
  "timestamp2": "2020-04-07 21:46:26",
  "command_tag": "SELECT",
  "query_pos": "",
  "duration": "0.148",
  "application_name": "\"app - 10.124.193.84:33116\"",
  "statement": "SELECT \"\"keystore_app_keyconfig\"\".\"\"name\"\", \"\"keystore_app_keyconfig\"\".\"\"usage\"\", \"\"keystore_app_keyconfig\"\".\"\"lifetime\"\", \"\"keystore_app_keyconfig\"\".\"\"band_id\"\", \"\"keystore_app_keyconfig\"\".\"\"rotation_method\"\", \"\"keystore_app_keyconfig\"\".\"\"seg_size\"\", \"\"keystore_app_keyconfig\"\".\"\"rotation_time\"\", \"\"keystore_app_keyconfig\"\".\"\"pre_create_keys\"\", \"\"keystore_app_keyconfig\"\".\"\"notes\"\" FROM \"\"keystore_app_keyconfig\"\" WHERE \"\"keystore_app_keyconfig\"\".\"\"name\"\" = 'qa_unit_tests'",
  "context": "",
  "internal_query": "",
  "timestamp": "2020-04-07 22:32:04.673",
  "transaction_id": "0",
  "database_name": "\"keystore\"",
  "query": "",
  "session_id": "5e8cf4b2.378c",
  "session_line_num": "1748710",
  "time_zone": "UTC",
  "internal_query_pos": "",
  "virtual_transaction_id": "5/0",
  "connection_from": "local",
  "hint": "",
  "location": ""
}

currently looks like:

%{NUMBER:duration}%{SPACE}ms%{SPACE}statement:%{SPACE}%{DATA:statement}",,(%{DATA:hint})?

changed to this:

%{NUMBER:duration}%{SPACE}ms%{SPACE}statement:%{SPACE}%{DATA:statement}",(%{DATA:detail})?,(%{DATA:hint})?

output looks like this but it is now all jumbled:

{
  "process_id": "14220",
  "time_zone2": "UTC",
  "error_severity": "LOG",
  "sql_state_code": "00000",
  "user_name": "\"app_keystore\"",
  "timestamp2": "2020-04-07 21:46:26",
  "command_tag": "SELECT",
  "query_pos": " \"\"keystore_app_keyconfig\"\".\"\"pre_create_keys\"\"",
  "duration": "0.148",
  "application_name": ",,,,,,,,\"app - 10.124.193.84:33116\"",
  "statement": "SELECT \"\"keystore_app_keyconfig\"\".\"\"name\"",
  "context": " \"\"keystore_app_keyconfig\"\".\"\"seg_size\"\"",
  "internal_query": " \"\"keystore_app_keyconfig\"\".\"\"band_id\"\"",
  "timestamp": "2020-04-07 22:32:04.673",
  "transaction_id": "0",
  "database_name": "\"keystore\"",
  "query": " \"\"keystore_app_keyconfig\"\".\"\"rotation_time\"\"",
  "session_id": "5e8cf4b2.378c",
  "session_line_num": "1748710",
  "time_zone": "UTC",
  "internal_query_pos": " \"\"keystore_app_keyconfig\"\".\"\"rotation_method\"\"",
  "virtual_transaction_id": "5/0",
  "connection_from": "local",
  "hint": " \"\"keystore_app_keyconfig\"\".\"\"lifetime\"\"",
  "location": " \"\"keystore_app_keyconfig\"\".\"\"notes\"\" FROM \"\"keystore_app_keyconfig\"\" WHERE \"\"keystore_app_keyconfig\"\".\"\"name\"\" = 'qa_unit_tests'\"",
  "detail": " \"\"keystore_app_keyconfig\"\".\"\"usage\"\""
}

having the commas in the sql statement throws this grok pattern off and not sure how to get around this so that i capture all of the fields.

i was able to figure it out. i just needed to use greedydata in my statement field and was then able to add my detail field.

(%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{TZ:time_zone})?,(%{DATA:user_name})?,(%{DATA:database_name})?,(%{NUMBER:process_id})?,("\[%{DATA:connection_from}\]")?,(%{USERNAME:session_id})?,(%{NUMBER:session_line_num})?,("%{DATA:command_tag}")?,(%{TIMESTAMP_ISO8601:timestamp2}%{SPACE}%{TZ:time_zone2})?,(%{DATA:virtual_transaction_id})?,(%{DATA:transaction_id})?,(%{DATA:error_severity})?,(%{NUMBER:sql_state_code})?,"duration:%{SPACE}%{NUMBER:duration}%{SPACE}ms%{SPACE}statement:%{SPACE}%{GREEDYDATA:statement}",(%{DATA:details})?,(%{DATA:hint})?,(%{DATA:internal_query})?,(%{DATA:internal_query_pos})?,(%{DATA:context})?,(%{DATA:query})?,(%{DATA:query_pos})?,(%{DATA:location})?,(%{GREEDYDATA:application_name})?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.