Mutate gsub regex pattern help

I am having an issue with the regex pattern in the mutate gsub function.

My current filter is below:

filter {

    # Log types are "SIEM_EVENTS", "TTP_EVENTS", "AUDIT_EVENTS".
    mutate {
        gsub => [ "[message]", "^<\d+>", "" ]
        gsub => [ "[message]", "^{", "" ]
        gsub => [ "[message]", "\\", "" ]
        gsub => [ "[message]", "$}", "" ]
    }

    if ([message] =~ "ttpDefinition") {
        kv {
            source => "[message]"
            field_split => ", "
            value_split => ": "
        }
        mutate {
            add_tag => [ "mimecast_ttp" ]
        }
    }

    if ([message] =~ "auditType") {
        kv {
            source => "[message]"
            field_split => ", "
            value_split => ": "
        }
        mutate {
            add_tag => [ "mimecast_audit" ]
        }
    }
    else {
        kv {
            source => "[message]"
            field_split => "|"
            value_split => "="
        }
        mutate {
            add_tag => [ "mimecast_siem" ]
        }
    }

}

I am still seeing events like this coming into Kibana:

{
                "\"messageId\"" => "\"<160978436504.10440.13591654598331832802@celery3.chartio.net>\"}\u0000",
         "\"userEmailAddress\"" => "someemailaddress",
               "\"scanResult\"" => "clean",
      "\"userAwarenessAction\"" => "N/A",
                  "\"subject\"" => "[External Sender] Chartio: Red Lobster Region Level DoorDash Ops\\r\\n Report for January 04,\\r\\n 2021",
                      "message" => "\"userEmailAddress\": \"someeamiladdress\", \"fromUserEmailAddress\": \"reports@chartio.com\", \"url\": \"https://chartio.com/doordash3/red-lobster-store-operations/report/a9ff35ed1b094b6bb8e69940088755ca.pdf\", \"ttpDefinition\": \"Default URL Protection Definition\", \"subject\": \"[External Sender] Chartio: Red Lobster Region Level DoorDash Ops\\r\\n Report for January 04,\\r\\n 2021\", \"action\": \"allow\", \"adminOverride\": \"N/A\", \"userOverride\": \"None\", \"scanResult\": \"clean\", \"category\": \"Computers & Technology\", \"sendingIp\": \"159.135.231.7\", \"userAwarenessAction\": \"N/A\", \"date\": \"2021-01-04T18:53:14+0000\", \"actions\": \"Allow\", \"route\": \"inbound\", \"creationMethod\": \"User Click\", \"emailPartsDescription\": [\"Body\"], \"messageId\": \"<160978436504.10440.13591654598331832802@celery3.chartio.net>\"}\u0000",
                         "host" => "127.0.0.1",
    "\"emailPartsDescription\"" => "\"Body\"",
                      "\"url\"" => "someurlshownhere",
             "\"userOverride\"" => "None",
                "\"sendingIp\"" => "someipaddress",
                     "@version" => "1",
                    "\"route\"" => "inbound",
                 "\"category\"" => "Computers & Technology",
            "\"adminOverride\"" => "N/A",
            "\"ttpDefinition\"" => "Default URL Protection Definition",
                   "\"action\"" => "allow",
                  "\"actions\"" => "Allow",
           "\"creationMethod\"" => "User Click",
                     "\"date\"" => "2021-01-04T18:53:14+0000",
                   "@timestamp" => 2021-01-04T18:53:56.984Z,
                         "tags" => [
        [0] "mimecast_ttp",
        [1] "mimecast_siem"
    ],
     "\"fromUserEmailAddress\"" => "reports@chartio.com"
}

I am trying to remove the backslashes and double quotation marks from both sides of the field:value mappings in the message, but nothing I have tried works.

Thanks for any help

Why are you not parsing this using a json filter?

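Neither side of those mappings contains a backslash. rubydebug adds the backslash to escape the double quote when it prints the event. For example, the actual name of the field is "emailPartsDescription", with the double quotes as part of the name, and its value is "Body", again including the quotes. If you want to remove double quotes, use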

gsub => [ "[message]", '"', "" ]

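When you say gsub => [ "[message]", "$}", "" ] I think you mean gsub => [ "[message]", "}$", "" ], since $ anchors the end of the line and so has to come after the brace. Even that will not match if you have a trailing NUL at the end of [message], and your sample output shows one (the \u0000 after the closing brace).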
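
Putting those corrections together, a minimal sketch might look like the following. It assumes the entries of a single gsub array are applied in order, so the NUL is stripped before the end-anchored brace pattern runs, and it assumes the regex engine reads \u0000 as the NUL character. It only cleans up [message]; it does not replace parsing the JSON properly.

```
# Sketch only: the patterns and their ordering are assumptions, not tested against the real feed.
# 1. drop the syslog priority prefix, e.g. <14>
# 2. drop the trailing NUL so that }$ can match
# 3. drop the leading and trailing braces
# 4. drop the double quotes (single-quoted pattern so the " needs no escaping)
mutate {
    gsub => [
        "[message]", "^<\d+>", "",
        "[message]", "\u0000", "",
        "[message]", "^{", "",
        "[message]", "}$", "",
        "[message]", '"', ""
    ]
}
```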

@Badger,

As far as I know, I am using the json filter. From what I see in the original events coming from the source, there are three different event types: SIEM events, TTP events, and AUDIT events.

The SIEM events come in with a "|" delimiter and "=" as the field/value split. The other two types come in with a "," delimiter and ":" as the field/value split.

Here are the original events coming from my source:

SIEM_EVENTS:

{
       "message" => "<14>datetime=2020-12-28T15:19:58-0500|aCode=1zdbuOm5PCC9F9MFCTuuSg|acc=CUSA105A194|IP=104.47.70.104|Dir=Inbound|Subject=[External Sender] Order 19141384-8706668 Confirmation|MsgId=<ZxlPTRgCQFG51wKAPeXBdw@ismtpd0007p1las1.sendgrid.net>|headerFrom=orders@eat.grubhub.com|Sender=bounces+1584392-695d-kmeghan=redlobster.com@em7352.eat.grubhub.com|Rcpt=kmeghan@redlobster.com|Act=Acc|TlsVer=TLSv1.2|Cphr=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384\u0000",
      "@version" => "1",
    "@timestamp" => 2021-01-04T21:20:39.587Z,
          "host" => "127.0.0.1"
}

TTP_EVENTS:

{
       "message" => "<14>{\"userEmailAddress\": \"dhoyle@redlobster.com\", \"fromUserEmailAddress\": \"gprovins@nasrecruitment.com\", \"url\": \"https://analytics.talentegy.com/76f4f0c4-2d04-409b-9144-74babb87ceb8.js\", \"ttpDefinition\": \"Default URL Protection Definition\", \"subject\": \"[External Sender] RE: Red Lobster ACTIVATE Analytics Enhancement\", \"action\": \"allow\", \"adminOverride\": \"Allow\", \"userOverride\": \"None\", \"scanResult\": \"clean\", \"category\": \"Customer managed url allow list\", \"sendingIp\": \"40.107.70.43\", \"userAwarenessAction\": \"N/A\", \"date\": \"2021-01-04T20:58:06+0000\", \"actions\": \"Allow\", \"route\": \"inbound\", \"creationMethod\": \"User Click\", \"emailPartsDescription\": [\"Body\"], \"messageId\": \"\"}\u0000",
    "@timestamp" => 2021-01-04T20:58:26.698Z,
      "@version" => "1",
          "host" => "127.0.0.1"
}

AUDIT_EVENTS:

{
       "message" => "<14>{\"id\": \"eNoVzUsOgjAUQNG9vDEDyicFZgUdYExKNaBxRmhBCFjtR9IY9y4u4J77AS06q8TIIYPGkVQX-UrZaTI7nziW4MPU04EeV-sWfqsvrEruC-ubK4mD8iWw2puqn9LyKds5Bw9ay0czy-HvBWGEMMah70FntZGLUJ3kYhsV9ZkgPyYojbbmLZQe5QMy9P0BYhAuFw\", \"auditType\": \"Logon Authentication Failed\", \"user\": \"dkrieman@redlobster.com\", \"eventTime\": \"2021-01-04T13:57:54+0000\", \"eventInfo\": \"Failed authentication for dkrieman@redlobster.com <Krieman, Danielle NON-RL>, Date: 2021-01-04, Time: 08:57:54 GMT-05:00, IP: 194.61.53.41, Application: SMTP-MTA2, Reason: Account disabled\", \"category\": \"authentication_logs\"}\u0000",
    "@timestamp" => 2021-01-04T20:58:43.640Z,
      "@version" => "1",
          "host" => "127.0.0.1"
}

I am not really sure how to get this to work.

Try

    grok { match => { "message" => "^<%{POSINT:syslog_pri}>" } }
    mutate { gsub => [ "message", "^<\d+>", "", "message", "\u0000", "" ] }
    if [message] =~ /^{/ {
        json { source => "message" remove_field => [ "message" ] }
    } else {
        kv { field_split => "|" value_split => "=" remove_field => [ "message" ] }
    }

In place of the whole filter I already have?

If those are the only three types of messages you have then that is all you need. You can replace everything you already have.
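
The replacement drops the add_tag calls from the original filter. If those tags are still wanted, one possible sketch is to tag after parsing. It assumes the json filter places ttpDefinition and auditType at the top level of the event (its behaviour when no target is set), and it reuses the tag names from the original filter. It has not been run against the real feed.

```
filter {
    # Keep the syslog priority as a field, then strip it and the trailing NUL.
    grok { match => { "message" => "^<%{POSINT:syslog_pri}>" } }
    mutate { gsub => [ "message", "^<\d+>", "", "message", "\u0000", "" ] }

    if [message] =~ /^{/ {
        # TTP and AUDIT events are JSON objects.
        json { source => "message" remove_field => [ "message" ] }

        # Assumption: these keys only appear in their respective event types.
        if [ttpDefinition] {
            mutate { add_tag => [ "mimecast_ttp" ] }
        } else if [auditType] {
            mutate { add_tag => [ "mimecast_audit" ] }
        }
    } else {
        # SIEM events are pipe-delimited key=value pairs.
        kv { field_split => "|" value_split => "=" remove_field => [ "message" ] }
        mutate { add_tag => [ "mimecast_siem" ] }
    }
}
```

Chaining the branches with else if keeps each event to a single tag. In the original filter the ttpDefinition check was a separate if, so a TTP event also fell through to the else branch of the auditType check and picked up mimecast_siem, which matches the two tags in the sample output.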

That is awesome. I am currently putting it to stdout and it looks like it is working fine.

Again, thank you. I have been racking my brain over this the whole day.
