Hi,
We have noticed that the Azure Activity logs sometimes contain serialized JSON (a JSON-encoded string) instead of a JSON object in the azure.activitylogs.identity field. As a result, these logs fail to index with a mapping parsing exception.
We have created a workaround by adding an extra JSON processor to the Filebeat Azure Activity Logs pipeline (filebeat-7.11.1-azure-activitylogs-pipeline in our case). I think it would be good to add this to the official pipeline. I'm happy to submit a PR if you want me to.
An example document with serialized JSON in the identity field (see the escaped "identity" value inside "message"):
{
"agent": {
"ephemeral_id": "myid",
"hostname": "myhost",
"id": "id",
"name": "filebeat-64b5dc8949-md5p9",
"type": "filebeat",
"version": "7.11.1"
},
"azure": {
"consumer_group": "filebeat",
"enqueued_time": "2021-03-25T09:27:44.332Z",
"eventhub": "myeventhub",
"offset": 2147509126360,
"sequence_number": 2773145
},
"cloud": {
"account": {},
"instance": {
"id": "myid",
"name": "myname"
},
"machine": {
"type": "Standard_E8s_v3"
},
"provider": "azure",
"region": "westeurope"
},
"ecs": {
"version": "1.7.0"
},
"event": {
"dataset": "azure.activitylogs",
"module": "azure"
},
"fileset": {
"name": "activitylogs"
},
"input": {
"type": "azure-eventhub"
},
"message": "{\"Authorization\":\"null\",\"Claims\":\"{\\\"http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress\\\":\\\"Microsoft.RecoveryServices\\\"}\",\"DeploymentUnit\":\"myunit\",\"EventId\":162,\"EventName\":\"AzureBackupActivityLog\",\"ResultDescription\":\"Backup Succeeded\",\"category\":\"Administrative\",\"correlationId\":\"666a2ef3-a0a8-4bb3-a977-1d60aa5fb185\",\"durationMs\":0,\"eventName\":\"Backup\",\"identity\":\"{\\\"claims\\\":{\\\"http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress\\\":\\\"Microsoft.RecoveryServices\\\"}}\",\"level\":\"Informational\",\"location\":\"westeurope\",\"operationId\":\"bd686d52-0f93-440e-85a7-8ed03e2a8d75\",\"operationName\":\"Microsoft.RecoveryServices/vaults/backupFabrics/protectionContainers/protectedItems/backup/action\",\"operationVersion\":\"null\",\"properties\":{\"Entity Name\":\"mymachine\",\"Job Id\":\"jobid\",\"Start Time\":\"2021-03-24 21:50:40Z\"},\"resourceId\":\"/SUBSCRIPTIONS/111B1A11-11E1-1FDE-1111-11FAD111ECCA/RESOURCEGROUPS/MY-GROUP/PROVIDERS/MICROSOFT.RECOVERYSERVICES/VAULTS/RSV-DATAPROTECTION\",\"resultType\":\"Succeeded\",\"time\":\"2021-03-25T09:22:28.4002017Z\"}",
"service": {
"type": "azure"
},
"tags": [
"forwarded"
]
}
Adding the following JSON processor after the first JSON processor in the pipeline parses the serialized identity string, and the log event is indexed successfully:
{
  "json": {
    "field": "azure.activitylogs.identity",
    "if": "ctx.azure?.activitylogs?.identity instanceof String",
    "ignore_failure": true
  }
}
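
For anyone who wants to reason about the behaviour outside Elasticsearch, here is a rough Python sketch of what the extra processor does. It is only an illustration of the logic, not the ingest pipeline itself: the function name parse_identity and the shortened sample message are made up for this example, and the try/except stands in for "ignore_failure".

```python
import json

CLAIM = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress"


def parse_identity(activity_log: dict) -> dict:
    """Hypothetical stand-in for the extra JSON processor.

    If "identity" is still a string after the first JSON parse, decode it
    in place; otherwise leave the document untouched.
    """
    identity = activity_log.get("identity")
    if isinstance(identity, str):  # mirrors the "instanceof String" condition
        try:
            activity_log["identity"] = json.loads(identity)
        except json.JSONDecodeError:
            pass  # mirrors "ignore_failure": true
    return activity_log


# Shortened, made-up version of the "message" field from the example document:
# the identity value is itself a JSON-encoded string.
message = json.dumps({
    "category": "Administrative",
    "identity": json.dumps({"claims": {CLAIM: "Microsoft.RecoveryServices"}}),
})

first_pass = json.loads(message)          # what the first JSON processor yields
fixed = parse_identity(dict(first_pass))  # what the extra processor adds

print(type(first_pass["identity"]).__name__)
print(fixed["identity"]["claims"][CLAIM])
```

Documents where identity is already an object pass through unchanged, which is why the condition on the string type matters.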