When building analytics in Kibana I need to create keyword fields,
because unfortunately plain text cannot be analyzed in Kibana dashboards.
So I am trying to break the logs up into fields.
Grok Constructor works fine when the logs have one specific format. But what if there are many different formats?
For example, should I use something like if/else across many different patterns?
Is this the recommended way to do it?
It does not seem to work
Always order the patterns so that a more specific pattern is tried before a less specific one. And always anchor your patterns if possible, so that a failed match is very cheap. All four of those patterns should probably start with ^ to anchor them to the start of [message].
I don't think you need break_on_match in this case. If you are matching against several different log formats, it is best to stop as soon as you get a match. If you are matching against several different patterns, each of which matches part of a log message, then yes, setting break_on_match to false is essential.
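To illustrate the ordering and anchoring advice, here is a minimal Python sketch that imitates how a list of patterns is tried in order. It is not Logstash configuration; the pattern names, the field names, and the match_line helper are made up for the example, and the break_on_match parameter only mimics the grok option of the same name.

```python
import re

# Hypothetical patterns, ordered from most specific to least specific.
# Each one starts with ^ so a line that cannot match is rejected cheaply.
PATTERNS = [
    ("with_pid", re.compile(
        r"^(?P<timestamp>\S+ \S+) (?P<log_level>[A-Z]+) (?P<pid>\d+) --- (?P<rest>.*)$")),
    ("generic", re.compile(
        r"^(?P<timestamp>\S+ \S+) (?P<log_level>[A-Z]+) (?P<rest>.*)$")),
]


def match_line(line, break_on_match=True):
    """Try each pattern in order and collect the named captures."""
    fields = {"matched": []}
    for name, pattern in PATTERNS:
        m = pattern.match(line)
        if m is None:
            continue
        fields["matched"].append(name)
        for key, value in m.groupdict().items():
            # Simplification: keep the first value captured for a field.
            fields.setdefault(key, value)
        if break_on_match:
            break  # stop at the first pattern that matches
    return fields
```

With break_on_match left at True, a line that fits the specific pattern never reaches the generic one, which is the behaviour you want when each pattern describes a whole log format.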
Thanks for all the answers.
Here are four log formats that will be used as input.
As output I want some keywords if available (e.g. "timestamp" and "log_level", which are present in all logs, or "calling_server", which is available in only one log). Note that there are going to be even more diverse logs (e.g. Java exceptions).
The last log has the most keywords, so its pattern will be put first. The other ones will then match partially (timestamp, log level, ID number). So what I need is a complete match of the last one, and from there on I would like to break up the messages as much as possible. If nothing matches (e.g. Java exceptions), I am going to keep it as a whole message (text).
2022-05-27 16:57:40.057 INFO [exmp-docgen-srv,613863e75eb43d9c,613863e75eb43d9c] 2680242 --- [http-nio-8386-exec-3] o.keycloak.adapters.KeycloakDeployment : [applicationRequestID=] Loaded URLs from http://exmp-auth.exmp.local:5000/auth/realms/exmp-dev/.well-known/openid-configuration
2022-05-27 16:57:49.121 WARN [exmp-docgen-srv,613863e75eb43d9c,613863e75eb43d9c] 2680242 --- [http-nio-8386-exec-3] o.a.c.util.SessionIdGeneratorBase : [applicationRequestID=74967d30-58d1-4aa9-860c-04963ac24917] Creation of SecureRandom instance for session ID generation using [SHA1PRNG] took [359] milliseconds.
2022-05-27 16:57:51.710 INFO [exmp-docgen-srv,613863e75eb43d9c,613863e75eb43d9c] 2680242 --- [http-nio-8386-exec-3] g.u.m.d.c.DocumentGenerationController : [applicationRequestID=74967d30-58d1-4aa9-860c-04963ac24917] Generate MSWord document called with template:exmp-document-generation-sample-template.docx
2022-09-20 16:08:20.874 ERROR 3608769 --- [scheduling-1] g.u.m.i.s.q.QueueItemHandlingService : An exception occured: No results for path: $['unit3Descr']
2022-09-20 16:08:35.889 INFO 3608769 --- [scheduling-1] g.u.m.i.clients.RestClientFilters : Request: Method: GET, URL: http://exmp-index.exmp.local:8381/camel/exmp/unit2/34554
2022-09-20 16:08:35.988 INFO 3608769 --- [reactor-http-epoll-3] g.u.m.i.clients.RestClientFilters : Response: 200 OK
2022-09-20 16:08:36.005 ERROR 3608769 --- [scheduling-1] g.u.m.i.s.q.QueueItemHandlingService : An exception occured: No results for path: $['unit3Descr']
2022-09-22 13:46:19.479 INFO 688857 --- [http-nio-8384-exec-5] o.s.c.c.s.e.NativeEnvironmentRepository : Adding property source: Config resource 'file [/opt/applications/exmp-configuration-service-configs/exmp-jbpm-configuration-mapping-dev.properties]' via location 'file:///opt/applications/exmp-configuration-service-configs/'
2022-09-22 13:46:19.485 INFO 688857 --- [http-nio-8384-exec-6] o.s.c.c.s.e.NativeEnvironmentRepository : Adding property source: Config resource 'file [/opt/applications/exmp-configuration-service-configs/exmp-jbpm-configuration-mapping-dev.properties]' via location 'file:///opt/applications/exmp-configuration-service-configs/'
2022-09-22 13:48:28.336 INFO Request-Info:[protocol=HTTP/1.1, exmp_user_name=exmp-supervisor, method=GET, entrypoint=/api/prj-menus/unit1/1, app_server=exmp-index:8081, x_request_identifier=b2064184-7a14-4439-93d4-7b7e7126dd1a, x_active_unit1_id=1, x_menu_id=33, cause=incoming_request, calling_server=172.30.2.224, request_time=N/A, reception_date=2022-09-22T13:48:28.247+0300, time_elapsed_ms=88] 2756977 --- [XNIO-1 task-3] g.u.m.commons.logging.MDCLoggingFilter : processing_end
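As a rough sketch of the "most specific first, keep the whole message otherwise" idea applied to the sample lines above, the following Python code uses plain regular expressions in place of grok patterns. The field names (service, trace_id, span_id, pid, thread, logger, log_message, request_info) and the parse_line helper are assumptions made for this example; the _grokparsefailure tag imitates what grok adds by default when nothing matches.

```python
import re

# Building blocks shared by all formats (field names are assumptions).
TS = r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
LEVEL = r"(?P<log_level>[A-Z]+)"
TAIL = (r"(?P<pid>\d+) --- \[(?P<thread>[^\]]+)\] "
        r"(?P<logger>\S+)\s+: (?P<log_message>.*)")

# Most specific format first, least specific last; all anchored with ^.
LOG_PATTERNS = [
    ("request_info", re.compile(
        "^" + TS + " " + LEVEL
        + r" Request-Info:\[(?P<request_info>[^\]]*)\] " + TAIL + "$")),
    ("service_context", re.compile(
        "^" + TS + " " + LEVEL
        + r" \[(?P<service>[^,\]]+),(?P<trace_id>[^,\]]+),(?P<span_id>[^,\]]+)\] "
        + TAIL + "$")),
    ("plain", re.compile("^" + TS + " " + LEVEL + " " + TAIL + "$")),
    ("level_only", re.compile(
        "^" + TS + " " + LEVEL + r" (?P<log_message>.*)$")),
]


def parse_line(line):
    """Return the fields of the first matching pattern, or the raw message."""
    for name, pattern in LOG_PATTERNS:
        m = pattern.match(line)
        if m is not None:
            fields = m.groupdict()
            fields["pattern"] = name
            return fields
    # Nothing matched (e.g. a stack trace line): keep the text as a whole.
    return {"message": line, "tags": ["_grokparsefailure"]}
```

The request_info value is left as one string here; breaking it into its key=value pairs is a separate step.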
Disclaimer: I tested this in the Grok Debugger, and all three lines produce the same fields. After this you only need to remove the [ and ] from fieldx with gsub, if they are present. You can also split fieldx ([exmp-docgen-srv,613863e75eb43d9c,613863e75eb43d9c]) further, if that is important.
Yes, I would advise using the kv filter for fieldname=value cases, but test it.
For this: exmp-docgen-srv,613863e75eb43d9c,613863e75eb43d9c you can also use the csv filter.
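The three post-processing steps mentioned here (strip the brackets, split key=value pairs, split a comma-separated value) can be sketched in Python roughly as follows. This only imitates what the gsub, kv and csv steps would do; the helper names and the column names passed to split_csv are hypothetical.

```python
import re


def strip_brackets(value):
    """Remove a leading '[' and a trailing ']' if present (the gsub step)."""
    return re.sub(r"^\[|\]$", "", value)


def split_kv(value, field_split=", ", value_split="="):
    """Split 'a=1, b=2' into {'a': '1', 'b': '2'} (the kv step)."""
    result = {}
    for pair in value.split(field_split):
        key, sep, val = pair.partition(value_split)
        if sep:  # skip fragments that contain no '='
            result[key.strip()] = val.strip()
    return result


def split_csv(value, columns):
    """Map comma-separated values onto the given column names (the csv step)."""
    return dict(zip(columns, value.split(",")))


context = strip_brackets("[exmp-docgen-srv,613863e75eb43d9c,613863e75eb43d9c]")
# Column names below are assumptions, not taken from the logs.
context_fields = split_csv(context, ["service", "trace_id", "span_id"])
request_fields = split_kv("protocol=HTTP/1.1, exmp_user_name=exmp-supervisor, method=GET")
```

Values that themselves contain the separator (a comma followed by a space inside a value, for instance) would break this simple split, which is one more reason to test against a larger sample.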
Since this sample is only a few lines, something tells me there will be anomalies. With fewer than 100 lines per sample you cannot be sure that everything will be OK.
Your first approach (multiple matches) is not wrong; test which is faster.