Fields are merge grok output

I have a log like this:

2020-05-22T11:45:21.297418 H 20 mytest.cpp:175 stdlog sql_execute 38533 73 testdb admin 100-xxy {"query_str","client","execution_time_ms","total_time_ms"}
{"select * from mydatabase;","tcp:localhost:4000","27","73"}

My Grok pattern:
filter {
grok {
match => { "message" => "%{GREEDYDATA:timex} %{WORD:Ix} %{NUMBER:nox} %{GREEDYDATA:code} %{WORD:stdlog} %{WORD:type} %{NUMBER:numbery} %{NUMBER:noh} %{WORD:dbtype} %{WORD:loguserx} %{GREEDYDATA:sessionID} %{GREEDYDATA:query_field} %{GREEDYDATA:query_stats}" }
}
}

problem 01:
output:
My result:
sessionID field contains with part of query_field result, and query_field contains with part of the query_stats results.

sessionID result=> 731-ufsN {"myquery","source","time","total_time"} {"select *

query_field result=> from
query_stats result=> db_states;","tcp:myhost:12336","10","15"}

expected result:
sessionID result=> 731-ufsN
query_field result=> {"myquery","client","time","total_time"}
query_stats result=>{"select * db_states;","tcp:myhost:12336","10","15"}

Problem 02:
how can I get each value inside the {"select * db_states;","tcp:myhost:12336","10","15" }.

Any help with this??

Hi,

Try this pattern.

%{TIME:timex} %{WORD:Ix} %{NUMBER:nox} (?<code>[^\s]*) %{WORD:stdlog} %{WORD:type} %{NUMBER:numbery} %{NUMBER:noh} %{WORD:dbtype} %{WORD:loguserx} (?<sessionID>[^\s]*) {(?<query_field>[^}]*)}{(?<query_stats>[^}]*)}$

Thanks Pup_seba, Actually I got my required output just changing the data pattern from GREEDYDATA to DATA in sessionID, query_field fileds. Not working on splitting the values to different fields like {"myquery","source","time","total_time"}.

Anyway I will try your solution in my code as well. Thanks a lot.

Hi Pup_seba,
Do let me know, how can I get different field for this result. select query changing every time, like will be more complex time to time (including inner join, join). Need to create graphs using time, total_time fields.

{"select * db_states;","tcp:myhost:12336","10","15"}

My pattern like this. But did not work.

filter {
grok {
match => { "message" => "%{GREEDYDATA:currenttimex} %{WORD:Ix} %{NUMBER:no16} %{GREEDYDATA:code} %{WORD:stdlog} %{WORD:type} %{NUMBER:number38} %{NUMBER:no73} %{WORD:dbtype} %{WORD:loguser} %{DATA:sessionID} %{DATA:query_topic} %{GREEDYDATA:query_stats}" }
}
}

filter {
grok {
match => { "query_stats" => "%{GREEDYDATA:sql} %{GREEDYDATA:client} %{NUMBER:time} %{NUMBER:total_time}" }
}
}

Hi himalc :slight_smile:

I think I did not understand your initial expected output then. So, I guess that instead of this:

sessionID result=> 731-ufsN
query_field result=> {"myquery","client","time","total_time"}
query_stats result=>{"select * db_states;","tcp:myhost:12336","10","15"}

You actually expect this output:

sessionID result=> 731-ufsN
query_str => myquery
client => client
execution_time_ms => time
total_time_ms => total_time
query_stats => {"select * db_states;","tcp:myhost:12336","10","15"}
query_field => "myquery" + "client" + "time" + "total_time"

If this is the case, then I guess you need a grok filter AND a mutate filter. The grok would look like this:

%{TIME:timex} %{WORD:Ix} %{NUMBER:nox} (?<code>[^\s]*) %{WORD:stdlog} %{WORD:type} %{NUMBER:numbery} %{NUMBER:noh} %{WORD:dbtype} %{WORD:loguserx} (?<sessionID>[^\s]*) {(?<query_str>[^,]*),(?<client>[^,]*),(?<execution_time_ms>[^,]*),(?<total_time_ms>[^}]*)}{(?<query_stats>[^}]*)}$

Then, you would need to "construct" the query_stats field, and this is where you use the "mutate" filter. I can't test it right now, but this post could help you Adding a field from existing ones.

The formula I'm using in the grok is quite easy and is always the same. Basically I use some literals to "pinpoint" somethings (look at the commas for instance, those are just literal matches).
Then I use this extended group (?subexp) over and over again :slight_smile: You can see more info about that here: https://github.com/kkos/oniguruma/blob/master/doc/RE

Then is only a regular expression where I just match "any character until (^) the character which in most cases is a comma. Then I just use a quantifier () so I match "all the characters" until the negation. For this you'll see "(?<field_name>[^,]).

Hope this helped you.

Hi Pup_seba,
Thanks your prompt reply. Unfortunately, above grok pattern does not work for me.
To be clarify, My final expectation is to get "sql query string","Time", "total time" and "client" values as a separate fields for generating graphs. Below is my log pattern (These values : {"select * from mydatabase;","tcp:localhost:4000","27","73"} ).

In this case,
query string is = "select * from mydatabase
client = tcp:localhost:4000
time = 27
total time = 73

log file:
2020-05-22T11:45:21.297418 H 20 mytest.cpp:175 stdlog sql_execute 38533 73 testdb admin 100-xxy {"query_str","client","execution_time_ms","total_time_ms"}
{"select * from mydatabase;","tcp:localhost:4000","27","73"}

For this log:

2020-05-22T11:45:21.297418 H 20 mytest.cpp:175 stdlog sql_execute 38533 73 testdb admin 100-xxy {"select * from mydatabase;","tcp:localhost:4000","27","73"}

Using this grok filter:

%{TIME:timex} %{WORD:Ix} %{NUMBER:nox} (?<code>[^\s]*) %{WORD:stdlog} %{WORD:type} %{NUMBER:numbery} %{NUMBER:noh} %{WORD:dbtype} %{WORD:loguserx} (?<sessionID>[^\s]*) {"(?<query_str>[^"]*)","(?<client>[^"]*)","(?<execution_time_ms>[\d]*)","(?<total_time_ms>[\d]*)"}$

This is the output:

{
  "code": "mytest.cpp:175",
  "total_time_ms": "73",
  "noh": "73",
  "query_str": "select * from mydatabase;",
  "sessionID": "100-xxy",
  "type": "sql_execute",
  "stdlog": "stdlog",
  "Ix": "H",
  "numbery": "38533",
  "loguserx": "admin",
  "nox": "20",
  "execution_time_ms": "27",
  "dbtype": "testdb",
  "client": "tcp:localhost:4000",
  "timex": "11:45:21.297418"
}

Hi Sebastian,
Thanks for the code. But apply this my logstash was shutdown.

Seems 1 field set missing above code.
2020-05-22T11:45:21.297418 H 20 mytest.cpp:175 stdlog sql_execute 38533 73 testdb admin 100-xxy {"query_str","client","execution_time_ms","total_time_ms"}
{"select * from mydatabase;","tcp:localhost:4000","27","73"}

I have updated it but still encountered an error. Check my Grok below. Let me know if you found any error here.

match => { "message" => "%{GREEDYDATA:currenttimex} %{WORD:Ix} %{NUMBER:no16} %{GREEDYDATA:code} %{WORD:stdlog} %{WORD:type} %{NUMBER:number38} %{NUMBER:no73} %{WORD:dbtype} %{WORD:loguser} %{DATA:sessionID} %{DATA:query_topic} %{"(?<query_str>[^"])","(?[^"])","(?<execution_time_ms>[\d])","(?<total_time_ms>[\d])"}$" }

log file:
2020-05-22T11:45:21.297418 H 20 mytest.cpp:175 stdlog sql_execute 38533 73 testdb admin 100-xxy {"query_str","client","execution_time_ms","total_time_ms"}
{"select * from mydatabase;","tcp:localhost:4000","27","73"}

This my Grok pattern

filter {
grok {
match => { "message" => "%{TIME:timex} %{WORD:Ix} %{NUMBER:nox} (?[^\s]) %{WORD:stdlog} %{WORD:type} %{NUMBER:numbery} %{NUMBER:noh} %{WORD:dbtype} %{WORD:loguserx} (?[^\s]) %{DATA:query_topic} {"(?<query_str>[^"])","(?[^"])","(?<execution_time_ms>[\d])","(?<total_time_ms>[\d])"}$" }
}
}

Hi Sebastian,

Do you have any idea of this issue.

Thanks.