Trouble Parsing Unstructured Logs With Custom Grok Patterns


(Alper Mehmet Özdemir) #1

I was given a task to visualize some unstructured logs from a file that contains logs in several different formats. I tried to use the grok filter for parsing, but I couldn't quite understand the documentation for custom patterns. I am using Elasticsearch 6.3.0. This is my .conf file:

    input {
        file {
            path => "/home/amo/unstructuredLogs.txt"
            start_position => "beginning"
            ignore_older => 0
        }
    }
    filter {
        if [message] =~ /":  103"/ or [message] =~ /":   93"/ or [message] =~ /":   89"/ {
            grok {
                match => { "message" => "(?<timestamp>(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})) (?<Message_Type>(\w{4,5}))\s+\[(?<Source>[\w\.]*)\s*:(\s){2,3}(?<Message_Code>\d{2,3})\] (?<Message>[\w\s]*):\s*(?<NPC_Event_Bytes>[\dA-F ]*)" }
            }
            date {
                match => ["timestamp", "yyyy-MM-dd HH:mm:ss,SSS"]
            }
        } else if [message] =~ /":  126"/ or [message] =~ /":  149"/ {
            grok {
                match => { "message" => "(?<timestamp>(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})) (?<Message_Type>(\w{4,5}))\s+\[(?<Source>[\w\.]*)\s*:(\s){2,3}(?<Message_Code>\d{2,3})\](?<Message>[\w\s\.\:\/]*)" }
            }
        } else if [message] =~ /":  151"/ {
            grok {
                match => { "message" => "(?<timestamp>(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})) (?<Message_Type>(\w{4,5}))\s+\[(?<Source>[\w\.]*)\s*:(\s){2,3}(?<Message_Code>\d{2,3})\](?<Message>[\w\s]*). Reason\: (?<Reason>([\w\s\!]+))" }
            }
        } else if [message] =~ /":   95"/ {
            grok {
                match => { "message" => "(?<timestamp>(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})) (?<Message_Type>(\w){4,5})\s+\[(?<Source>[\w\.]+)\s*:\s{2,3}(?<Message_Code>\d{2,3})\] (?<Message>(\w+)) (?<Error_Type>(\w+))" }
            }
        } else if [message] =~ /":  360"/ {
            grok {
                match => { "message" => "(?<timestamp>(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})) (?<Message_Type>(\w){4,5})\s+\[(?<Source>[\w\.]+)\s*:\s{2,3}(?<Message_Code>\d{2,3})\] %{GREEDYDATA:Message}" }
            }
        } else {
            drop {}
        }
    }
    output {
        stdout {
            codec => rubydebug
        }
        elasticsearch {
            hosts => ["localhost:9200"]
        }
    }

I want it to recognize the fields in the custom grok patterns and create fields accordingly, for example "Message_Type", "Message_Code", etc.
Also, why doesn't Logstash give me any stdout output unless I change the file name? Is it possible to overcome that?
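(For reference: the file input records how far it has read each file in a sincedb file, so a file that was already read once is not re-read on restart; renaming the file makes it look new, which matches the behavior described above. A minimal sketch of the usual workaround while testing; the `sincedb_path => "/dev/null"` value is a testing-only convention, not something to ship to production:)

```
input {
    file {
        path => "/home/amo/unstructuredLogs.txt"
        start_position => "beginning"
        # Discard the recorded read position so the file is
        # re-read from the beginning on every Logstash restart.
        sincedb_path => "/dev/null"
    }
}
```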

Here are some examples of the logs I will be parsing.

2018-03-21 18:15:42,977 WARN  [tr.com.abc.tcconnector.npc.NPCClientActor                   :  103] Incoming message from NPC is: 02 BB 10 A5 5A 05 BF 03 5A 13 3E 16 00 14 00 00 77 0E B5 56 03
2018-03-21 18:15:48,176 WARN  [tr.com.abc.tcconnector.npc.NPCClientActor                   :   93] Command CommandWithResponseNIRT to be sent to NPC is:	02 20 02 79 0A 53 03
2018-03-22 08:56:59,952 WARN  [tr.com.abc.tcconnector.npc.NPCClientActor                   :   89] Command CommandNIRT to be sent to NPC is:	09 BB 01 72 51 03
2018-03-22 09:52:48,857 WARN  [net.sf.jasperreports.engine.component.ComponentsEnvironment :  126] Found two components for namespace http://jasperreports.sourceforge.net/htmlcomponent
2018-03-22 09:52:50,483 WARN  [net.sf.jasperreports.engine.export.GenericElementHandlerEnviroment:  149] Found two generic element handler bundles for namespace http://jasperreports.sourceforge.net/jasperreports/html
2018-03-12 20:20:56,917 WARN  [tr.com.abc.tcconnector.npc.NPCClientActor                   :  151] Restarting NPC Client Service. Reason: NPC Connection problem!
2018-03-12 20:25:26,624 ERROR [tr.com.abc.tcconnector.npc.NPCAkka                          :   95] sendToNpcByAkkaWithResponse TIMOUT_ERROR

I used the Grok Debugger tool in Kibana to test the patterns, and there was no issue there.
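Since Ruby's regex engine accepts the same `(?<name>...)` named-group syntax that grok uses, the patterns can also be sanity-checked outside Logstash. A minimal sketch using the Message_Code 95 pattern from the config above, pasted in verbatim, against the last sample log line:

```ruby
# Last sample log line (ERROR / code 95 case).
line = "2018-03-12 20:25:26,624 ERROR [tr.com.abc.tcconnector.npc.NPCAkka                          :   95] sendToNpcByAkkaWithResponse TIMOUT_ERROR"

# The grok custom pattern for the ":   95" branch, unchanged.
pattern = /(?<timestamp>(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})) (?<Message_Type>(\w){4,5})\s+\[(?<Source>[\w\.]+)\s*:\s{2,3}(?<Message_Code>\d{2,3})\] (?<Message>(\w+)) (?<Error_Type>(\w+))/

m = pattern.match(line)
# Print every named capture, mirroring the fields grok would create.
m.names.each { |n| puts "#{n} => #{m[n]}" }
```

If this prints the expected field values, the pattern itself is fine and the problem lies elsewhere in the pipeline (e.g. the file input not re-reading the file).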

Here is a sample desired output. (for the last log example)

{
  "Message": "sendToNpcByAkkaWithResponse",
  "Message_Type": "ERROR",
  "Message_Code": "95",
  "Error_Type": "TIMOUT_ERROR",
  "Source": "tr.com.abc.tcconnector.npc.NPCAkka",
  "timestamp": "2018-03-12 20:25:26,624"
}

(Alper Mehmet Özdemir) #2

Help is no longer necessary. The syntax of `match` shown in the documentation and in the Logstash video were different; the one from the video worked.
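For anyone else who hits this: the grok filter's `match` option has appeared in two forms over the years, and older videos and examples tend to show the array form while current documentation shows the hash form. A sketch of both (the `%{...}` patterns here are just placeholders for illustration):

```
filter {
    grok {
        # Hash form, as in current documentation:
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:Message}" }

        # Array form, as seen in older examples and videos:
        # match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{GREEDYDATA:Message}"]
    }
}
```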


(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.