Help extracting logs from message field with \t delimiter

Hello, I would like to know how I can extract the fields from this log, using grok or some other way. I tried grok but made a complete mess of it.
The question is: how can I separate the fields on the \t delimiter? I also wrote out a
mapping, based on this log, of how the fields should look.

Timestamp            -->  Nov  3 21:45:10
Mta_name             -->  xxx009148
log_type             -->  msgtra.imss[26101]: NormalTransac
InTimeStamp          -->  2022 Nov  3 21:43:35 -03:00
ScanTimeStamp        -->  2022/11/03 21:43:35 -03:00
OutTimeStamp         -->  2022 Nov  3 21:44:07 -03:00
Message_ID           -->  654717088.24251667522701369.xxxx.xxxxx@xxxx.xxx.xxx.xxx.br
Internal_Message_ID  -->  xxxxx-EC9A-5D05-9E9E-xxxxxx
Postfix_ID           -->  xxxxx40EF
Scanner_ID           -->  1
Sender               -->  xxxx.xxx.xxxx@xxx.xxxx.xxxx.br
recipient            -->  recipient1@xxxxxxx.com.br;recipient2@exxxxx.xxx.br
Subject              -->  *** TESTE/PE - xxxx xxx xxxxxx (Carga:xxxxxx; Veiculo:xxxxx) ***
Client_IP            -->  xxxx.xx.xx.217
Delivery_IP          -->  mail.teste.com.br[186.248.133.196]:25
Delivery_feedback    -->  250 2.0.0 Ok: queued as 68F676046032F
Delivery_status      -->  sent
Action               -->  00100000000000000
Split_flag           -->  0
Extra_Item           -->  ""
ToDeliveryTimeStamp  -->  2022 Nov  3 21:43:35 -03:00
InDeliveryTimeStamp  -->  2022 Nov  3 21:43:35 -03:00

And here is a log sample:

<135>Nov  3 21:45:10 sf009148 msgtra.imss[26101]: NormalTransac\t2022 Nov  3 21:43:35 -03:00\t2022/11/03 21:43:35 -03:00\t2022 Nov  3 21:44:07 -03:00\t654717088.24251667522701369.JavaMail.fronteiras@sf063693.xxxx.xx.xxxx.br\tDB2BF6D2-xxxxxxxxx-685CE9Axxxxx\tCB66A640EF\t1\xxx.xxxx.fronteiras@xxx.xxxx.xxx.br\xxxxx@expressonxxxxxx.com.br;xxxxio@xxxxxxxuceno.com.br\t*** xxxx/PE - Liberacao de Carga (Carga:xxxx56; Veiculo:xxxxx9) ***\t172.16.12.217\tmail.xxxxxxxmuceno.com.br[186.248.133.196]:25\t250 2.0.0 Ok: queued as 68F676046032F\tsent\t00100000000000000\t0\t\t2022 Nov  3 21:43:35 -03:00\t2022 Nov  3 21:43:35 -03:00\t\t3\t

I would suggest using grok to parse the first part of the message (the syslog header), then using a csv filter with the separator set to a tab to parse the rest. Something like

    filter {
        grok {
            # Custom pattern for the syslog timestamp, e.g. "Nov  3 21:45:10"
            pattern_definitions => { "TIMESTAMP" => "%{NOTSPACE}%{SPACE}%{NOTSPACE}%{SPACE}%{NOTSPACE}" }
            match => { "message" => "<%{NUMBER}>%{TIMESTAMP:Timestamp} %{WORD:Mta_name} %{GREEDYDATA:[@metadata][restOfLine]}" }
        }
        csv {
            source => "[@metadata][restOfLine]"
            # The separator must be a single literal tab character between the quotes
            separator => "	"
            columns => [ "log_type", "InTimeStamp", "ScanTimeStamp", "OutTimeStamp", "Message_ID" ]
        }
    }

I'll leave you to add in the rest of the column names...
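
For reference, here is a sketch of the csv filter with the remaining column names filled in. The names are copied as-is from the field mapping in the question, and the separator again has to be a literal tab typed between the quotes. The sample line carries three more tab-separated values after InDeliveryTimeStamp; assuming the csv filter's default autogenerate_column_names behaviour, those should show up as column21 to column23.

    csv {
        source => "[@metadata][restOfLine]"
        # A single literal tab character between the quotes
        separator => "	"
        columns => [
            "log_type", "InTimeStamp", "ScanTimeStamp", "OutTimeStamp",
            "Message_ID", "Internal_Message_ID", "Postfix_ID", "Scanner_ID",
            "Sender", "recipient", "Subject", "Client_IP",
            "Delivery_IP", "Delivery_feedback", "Delivery_status", "Action",
            "Split_flag", "Extra_Item", "ToDeliveryTimeStamp", "InDeliveryTimeStamp"
        ]
    }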

Hello @Badger! I tried to do this parsing with the dissect filter, but it is not working. Can you tell me if there is something wrong with the structure of my filter?

    filter {
        if "imsva" in [tags] {
            dissect {
                mapping => {
                    "message" => "%{eventtimestamp}\t%{intimestamp}\t%{ScanTimeStamp}\t%{OutTimeStamp}\t%{Message_ID}\t%{Internal_Message_ID}\t%{Postfix_ID}\t%{Scanner_ID}\t%{Sender}\t%{Recipient}\t%{Subject}\t%{Client_IP}\t%{Delivery_IP}\t%{Delivery_feedback}\t%{Delivery_status}\t%{Action}\t%{Split_flag}\t%{Extra_Item}\t%{ToDeliveryTimeStamp}\t%{InDeliveryTimeStamp}"
                }
            }
        }
    }

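dissect does not use regexps, so the \t in your mapping is matched as a literal backslash followed by the letter t, not as a tab. You need literal tabs in your mapping.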
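
If pasting literal tabs into the mapping is awkward, an alternative is to have Logstash interpret escape sequences in quoted strings. This is a minimal sketch, assuming config.support_escapes: true has been set in logstash.yml (it is off by default) so that \t inside a quoted config string becomes a real tab. The final %{trailing} field is a hypothetical name added here to catch the extra values at the end of the sample line; without it, the last field would capture everything up to the end of the line.

    # Assumes logstash.yml contains:  config.support_escapes: true
    filter {
        if "imsva" in [tags] {
            dissect {
                mapping => {
                    "message" => "%{eventtimestamp}\t%{intimestamp}\t%{ScanTimeStamp}\t%{OutTimeStamp}\t%{Message_ID}\t%{Internal_Message_ID}\t%{Postfix_ID}\t%{Scanner_ID}\t%{Sender}\t%{Recipient}\t%{Subject}\t%{Client_IP}\t%{Delivery_IP}\t%{Delivery_feedback}\t%{Delivery_status}\t%{Action}\t%{Split_flag}\t%{Extra_Item}\t%{ToDeliveryTimeStamp}\t%{InDeliveryTimeStamp}\t%{trailing}"
                }
            }
        }
    }

Note that with this mapping %{eventtimestamp} holds everything before the first tab, which in the sample is the whole syslog header plus NormalTransac, so that part would still need a grok or a second dissect to split further.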
