Parsing logs from a text file using Logstash

Hi,

I have logs in a .txt file. I tried the kv, grok and dissect filters to ingest them with Logstash, but the log format is not the same from line to line. Since this is unstructured data, maybe I need to combine the above filters and assign each field its particular value.

10/12/2021 8:54:00 AM,MSGID: a60a4c82-82ea-4ff3-ac64-f4450d45e72eMobile_Number: 00000055675Job_no:408853083User_id1197341 mstrGateWay TLD
10/12/2021 8:54:03 AM,MSGID: 0e60edb1-2c22-46b4-bd00-2aaa19e8ccbeMobile_Number: 000000555675Job_no:408853085User_id1197341 mstrGateWay TLD
10/12/2021 8:54:03 AM,strDisplay: Message Id  : a60a4c82-82ea-4ff3-ac64-f4450d45e72e Done Date   : 2110120324 STAT        : UNDELIV mstrGateWay: TLD
10/12/2021 8:54:06 AM,strDisplay: Message Id  : 0e60edb1-2c22-46b4-bd00-2aaa19e8ccbe Done Date   : 2110120324 STAT        : UNDELIV mstrGateWay: TLD
10/12/2021 8:55:27 AM,MSGID: ef11c6f8-ceb5-4d64-94ff-22e6edccc089Mobile_Number: 00000091861Job_no:3027869048User_id1030354 mstrGateWay TLD
10/12/2021 8:55:37 AM,strDisplay: Message Id  : ef11c6f8-ceb5-4d64-94ff-22e6edccc089 Done Date   : 211012052535 STAT        : DELIVRD mstrGateWay: TLD

Some fields have no space between them, and colons follow only some of the field names, as in the logs above.

Fields to extract:
timestamp>10/12/2021 8:55:37 AM
MSGID: a60a4c82-82ea-4ff3-ac64-f4450d45e72e
Mobile_Number: 00000055675
Job_no:408853083
User_id <1197341>
mstrGateWay
Done Date : 211012052535
STAT : DELIVRD
mstrGateWay: TLD

Unstructured data with no delimiters... I would use grok. This is incomplete, but between them the patterns should include all the tricks you need:

    grok {
        break_on_match => false
        match => {
            "message" => [
                "(MSGID|Message Id)\s*:\s*(?<msgid>[0-9a-z-]{36})",
                "Mobile_Number: (?<mobileNumber>[0-9]+)",
                "Job_no\s*:\s*(?<jobNumber>[0-9]+)",
                "User_id(?<userId>[0-9]+)",
                "mstrGateWay[:\s]*(?<mstrGateway>[A-Z]+)",
                "Done Date[:\s]*(?<doneDate>[0-9]+)",
                "^(?<timestamp>[^,]+)"
            ]
        }
    }

Note that I have [timestamp] last since it occurs on every line, so I avoid this bug.
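
The grok above leaves out STAT and keeps [timestamp] as a plain string. A minimal sketch of how both gaps might be filled follows. It assumes the timestamps are month/day/year with a 12-hour clock (which agrees with the Done Date values starting 211012), and the field name stat is my own choice, not something from the original logs. Without a timezone option, the date filter interprets the time in the Logstash host's timezone.

```
filter {
    # Assumed addition: capture the delivery status (e.g. UNDELIV, DELIVRD).
    grok {
        match => { "message" => "STAT\s*:\s*(?<stat>[A-Z]+)" }
        # The MSGID lines carry no STAT, so a non-match is expected there;
        # an empty list stops this filter from adding _grokparsefailure.
        tag_on_failure => []
    }

    # Assumed format: M/d/yyyy with a 12-hour clock and an AM/PM marker.
    # On success the parsed value is written to @timestamp (the default target).
    date {
        match => ["timestamp", "M/d/yyyy h:mm:ss a"]
    }
}
```

If the date format assumption is wrong, the date filter tags the event with _dateparsefailure rather than dropping it, so it is easy to spot in the output.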
