Hello everyone. As the title indicates, I am trying to match multi-line log entries, and I would like to save the first occurrence of the timestamp to a separate field from the last occurrence of the timestamp. I am using the multiline codec to separate tasks from a big log file, and a (?m) regex before the timestamp. I need this to measure the time between process completion. Any ideas how this could be done?
2020-12-16 15:43:31.605 INFO 18020 --- [http-nio-8080-exec-3] c.n.w.workflow.service.DataService : Getting groups of task 5fda1d109ceec746643760f8 in case 11.11.2020 13:20 level: 0
2020-12-16 15:43:34.346 INFO 18020 --- [http-nio-8080-exec-1] c.n.w.workflow.service.TaskService : [5fda1d109ceec746643760f5]: Task [GENERATE] in case [11.11.2020 13:20] assigned to [super@netgrif.com] was finished
I would like the output to be Start: 15:43:31.605 Severity: INFO INT: 18020 Thread: http-nio-8080-exec-3 Class: c.n.w.workflow.service.TaskService GREEDYDATA: [...,...,...] End: 15:43:34.346
I was thinking of removing every item between the first and the last from the array of timestamps generated by (?m), and then separating those two into different fields, but I'm not really sure how to do so.
grok { match => { "message" => "\A%{TIMESTAMP_ISO8601:start}.*^%{TIMESTAMP_ISO8601:end}[^\n]*\Z" } }
\A anchors the first timestamp to the beginning of the message field. For the end time, you anchor it to the start of a line using ^, and the [^\n]*\Z means there cannot be another newline from there to the very end of the text.
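To see the anchoring outside of Logstash, here is a minimal plain-Ruby sketch of the same idea. It is only an illustration under some assumptions: `TS` is a simplified stand-in for grok's TIMESTAMP_ISO8601, `first_and_last` is a hypothetical helper name, and the `(?m)` prefix is added so that `.` can cross newlines (Ruby uses the same regex flavour as grok). It also computes the elapsed seconds, since that was the goal in the question.

```ruby
require "time"

# Simplified stand-in for grok's TIMESTAMP_ISO8601
# (assumption: fixed "YYYY-MM-DD HH:MM:SS.mmm" layout, as in the sample logs).
TS = /\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}/

# (?m) lets "." match newlines; "^" matches at the start of every line either way.
# The greedy ".*" backtracks from the end, so "end" lands on the last line.
FIRST_AND_LAST = /(?m)\A(?<start>#{TS}).*^(?<end>#{TS})[^\n]*\Z/

# Hypothetical helper: returns the first and last timestamps plus elapsed seconds,
# or nil when the text does not start with a timestamp and end with a timestamped line.
def first_and_last(message)
  m = FIRST_AND_LAST.match(message)
  return nil unless m

  elapsed = Time.parse(m[:end]) - Time.parse(m[:start])
  { "start" => m[:start], "end" => m[:end], "elapsed_seconds" => elapsed.round(3) }
end

sample = [
  "2020-12-16 15:43:31.605 INFO 18020 --- [http-nio-8080-exec-3] c.n.w.workflow.service.DataService : Getting groups",
  "Foo",
  " Bar",
  "2020-12-16 15:43:34.346 INFO 18020 --- [http-nio-8080-exec-1] c.n.w.workflow.service.TaskService : Task was finished"
].join("\n")

p first_and_last(sample)
```

A single-line message returns nil here, because `^` needs a second line start after the first timestamp.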
I just tried incorporating it into my pattern, but I'm not quite getting the output I would like from GREEDYDATA. For some reason I'm getting a single value rather than an array of values from all the GREEDYDATA matches.
My entire pattern : (?m)\A%{TIMESTAMP_ISO8601:start}.* %{SPACE} %{LOGLEVEL:LEVEL} %{INT:NUMBER} --{2} \[%{DATA:THREAD}] %{DATA:CLASS}\s(?m)%{GREEDYDATA:message}^%{TIMESTAMP_ISO8601:end}[^\n]*\Z
Is my usage correct?
I thought (?m) before GREEDYDATA indicates multiline input and that this would result in an array of values.
grok will not return an array of matches. If you need multiple matches for a single pattern then use a ruby filter and the String .scan function. There is an example of doing that here.
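As a rough illustration of the scan approach, here is a plain-Ruby sketch. It is not the linked example: `LOG_LINE` and `scan_log_lines` are hypothetical names, and the regex is a simplified stand-in that assumes the layout of the sample lines (a timestamp at the start of the line, then the text after the first " : ").

```ruby
# Simplified stand-in for one timestamped log line (assumption: layout of the sample logs).
# Group 1 is the timestamp, group 2 is everything after the first " : " on that line.
LOG_LINE = /^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*? : ([^\n]*)/

# Hypothetical helper: String#scan returns one [timestamp, text] pair per matching line;
# continuation lines without a timestamp are skipped.
def scan_log_lines(message)
  message.scan(LOG_LINE).map { |ts, text| { "timestamp" => ts, "text" => text } }
end

sample = [
  "2020-12-16 15:43:31.605 INFO 18020 --- [http-nio-8080-exec-3] c.n.w.workflow.service.DataService : Getting groups",
  "Foo",
  " Bar",
  "2020-12-16 15:43:34.346 INFO 18020 --- [http-nio-8080-exec-1] c.n.w.workflow.service.TaskService : Task was finished"
].join("\n")

scan_log_lines(sample).each { |row| puts "#{row['timestamp']} -> #{row['text']}" }
```

Inside a ruby filter, the input would come from `event.get("message")` and the result would be stored with `event.set`; the field name for the result is up to you.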
Sorry, I didn't express myself correctly. I would be totally fine with the output shown in the top answer here, in the field "extralines". I thought this could be done with simple (?m) usage.
"message" => "2020-12-16 15:43:31.605 INFO 18020 --- [http-nio-8080-exec-3] c.n.w.workflow.service.DataService : Getting groups of task 5fda1d109ceec746643760f8 in case 11.11.2020 13:20 level: 0\nFoo\n Bar\n2020-12-16 15:43:34.346 INFO 18020 --- [http-nio-8080-exec-1] c.n.w.workflow.service.TaskService : [5fda1d109ceec746643760f5]: Task [GENERATE] in case [11.11.2020 13:20] assigned to [super@netgrif.com] was finished"
"end" => "2020-12-16 15:43:34.346",
"CLASS" => "c.n.w.workflow.service.DataService",
"LEVEL" => "INFO",
"message" => [
[0] "2020-12-16 15:43:31.605 INFO 18020 --- [http-nio-8080-exec-3] c.n.w.workflow.service.DataService : Getting groups of task 5fda1d109ceec746643760f8 in case 11.11.2020 13:20 level: 0\nFoo\n Bar\n2020-12-16 15:43:34.346 INFO 18020 --- [http-nio-8080-exec-1] c.n.w.workflow.service.TaskService : [5fda1d109ceec746643760f5]: Task [GENERATE] in case [11.11.2020 13:20] assigned to [super@netgrif.com] was finished",
[1] " : Getting groups of task 5fda1d109ceec746643760f8 in case 11.11.2020 13:20 level: 0\nFoo\n Bar\n"
],
etc. Note that the array of arrays is just a presentation thing at https://grokdebug.herokuapp.com/. Even in that SO answer you linked to, the GREEDYDATA is matching a single string.
If you want to get each line in a separate array entry then use mutate+split.
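For reference, splitting on newlines amounts to the following in plain Ruby. This is only a sketch: `split_lines` is a hypothetical helper, and the input is a shortened version of the captured value shown above. One thing to watch in the Logstash config itself: a quoted "\n" is only treated as a newline when config.support_escapes is enabled in logstash.yml.

```ruby
# Hypothetical helper: turns one multi-line string into an array with one entry per line.
# String#split drops the empty string that a trailing newline would otherwise produce.
def split_lines(text)
  text.split("\n")
end

captured = " : Getting groups of task\nFoo\n Bar\n"
p split_lines(captured)
```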
Sorry for the screenshot; I hope the quality is good enough to see the content.
I copied your input message and pattern just to be 100% sure I'm not making a mistake anywhere myself. Thanks for the tip about mutate + split; I will certainly do that after I figure this out.
Sorry, I re-read your answer and realized I had misread the output. I get why I'm getting a single value now. Is it possible for me to get ALL the GREEDYDATA values in a single field? Thanks :)