Having issues parsing time in CEF

Hi all,

I'm ingesting ExtraHop Reveal(x) (https://www.extrahop.com/products/security/) logs with the Fleet-managed Elastic Agent "CEF" integration, using the syslog input over UDP.

Most of the fields are extracted correctly, except for "rt", "start" and "end".

Error Message:

"error in field 'end': value is not a valid timestamp"
"error in field 'rt': value is not a valid timestamp"
"error in field 'start': value is not a valid timestamp"

Sample Event:

<14>2021-11-19T20:28:16.012Z noname.no-name-company.org CEF:0|ExtraHop|Reveal(x)|7.8|1|VPN Client Data Exfiltration|6|cn1=25769823408 cn1Label=detectionID cn2=65 cn2Label=riskScore cs1=https://imaginary-company.cloud.extrahop.com/extrahop/#/detections/detail/25769823408 cs1Label=detectionURL cs2=sec,sec.action,sec.exfil cs2Label=category rt=2021-11-19T20:00:00.000Z end=2021-11-19T19:41:30.000Z start=2021-11-19T19:40:00.001Z src=10.54.10.60 dst=00:50:56:B9:7E:52 msg=[dl1c34430.imaginary-company.org](#/metrics/devices/7e2dff4fa4dc4e58889e0f0399974a66.fff4380a4a0a0000/overview?from\=1637350800&interval_type\=DT&until\=1637350890) received an unusual amount of data from internal resources.\n\nThe VPN client received:\n* 1.5GB from `10.210.3.216` over HTTP\n\n

As shown in the sample event above, the fields rt=2021-11-19T20:00:00.000Z, end=2021-11-19T19:41:30.000Z and start=2021-11-19T19:40:00.001Z are in the format yyyy-MM-dd'T'HH:mm:ss.SSSZ.

I tried the following time formats for these fields in the "Component Templates", but the issue persists.

yyyy-MM-dd'T'HH:mm:ss.SSS'Z'
yyyy-MM-dd'T'HH:mm:ss.SSS'X'
yyyy-MM-dd'T'HH:mm:ss.SSSZ
yyyy-MM-dd'T'HH:mm:ss.SSSX

Any help would be greatly appreciated


I ran that sample through the decode_cef processor in this playground. The logs are not strictly following the specification that is linked to from the decode_cef docs.

[
  {
    "@timestamp": "2023-10-23T15:06:00.288Z",
    "cef": {
      "device": {
        "event_class_id": "1",
        "product": "Reveal(x)",
        "vendor": "ExtraHop",
        "version": "7.8"
      },
      "extensions": {
        "deviceCustomNumber2Label": "riskScore",
        "deviceCustomString1": "https://imaginary-company.cloud.extrahop.com/extrahop/#/detections/detail/25769823408",
        "deviceCustomString2": "sec,sec.action,sec.exfil",
        "deviceCustomString2Label": "category",
        "message": "[dl1c34430.imaginary-company.org](#/metrics/devices/7e2dff4fa4dc4e58889e0f0399974a66.fff4380a4a0a0000/overview?from=1637350800&interval_type=DT&until=1637350890) received an unusual amount of data from internal resources.\n\nThe VPN client received:\n* 1.5GB from `10.210.3.216` over HTTP\n\n",
        "sourceAddress": "10.54.10.60",
        "deviceCustomNumber1": 25769823408,
        "deviceCustomNumber1Label": "detectionID",
        "deviceCustomNumber2": 65,
        "deviceCustomString1Label": "detectionURL"
      },
      "name": "VPN Client Data Exfiltration",
      "severity": "6",
      "version": "0"
    },
    "error": {
      "message": [
        "error in field 'end': value is not a valid timestamp",
        "error in field 'rt': value is not a valid timestamp",
        "error in field 'start': value is not a valid timestamp",
        "error in field 'dst': value is not a valid IP address"
      ]
    },
    "event": {
      "code": "1",
      "original": "<14>2021-11-19T20:28:16.012Z noname.no-name-company.org CEF:0|ExtraHop|Reveal(x)|7.8|1|VPN Client Data Exfiltration|6|cn1=25769823408 cn1Label=detectionID cn2=65 cn2Label=riskScore cs1=https://imaginary-company.cloud.extrahop.com/extrahop/#/detections/detail/25769823408 cs1Label=detectionURL cs2=sec,sec.action,sec.exfil cs2Label=category rt=2021-11-19T20:00:00.000Z end=2021-11-19T19:41:30.000Z start=2021-11-19T19:40:00.001Z src=10.54.10.60 dst=00:50:56:B9:7E:52 msg=[dl1c34430.imaginary-company.org](#/metrics/devices/7e2dff4fa4dc4e58889e0f0399974a66.fff4380a4a0a0000/overview?from\\=1637350800&interval_type\\=DT&until\\=1637350890) received an unusual amount of data from internal resources.\\n\\nThe VPN client received:\\n* 1.5GB from `10.210.3.216` over HTTP\\n\\n",
      "severity": 6
    },
    "message": "[dl1c34430.imaginary-company.org](#/metrics/devices/7e2dff4fa4dc4e58889e0f0399974a66.fff4380a4a0a0000/overview?from=1637350800&interval_type=DT&until=1637350890) received an unusual amount of data from internal resources.\n\nThe VPN client received:\n* 1.5GB from `10.210.3.216` over HTTP\n\n",
    "observer": {
      "product": "Reveal(x)",
      "vendor": "ExtraHop",
      "version": "7.8"
    },
    "source": {
      "ip": "10.54.10.60"
    }
  }
]
  • dst=00:50:56:B9:7E:52 is a MAC address, not an IP address
  • start, end, and rt don't follow the timestamp formats in the CEF specification (example value 2021-11-19T19:41:30.000Z). This is a screenshot from the spec of the CEF timestamp formats.

If you cannot fix the issues on the ExtraHop side, you could preprocess the data to align the timestamps with the expected format and replace dst= with dmac=.
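A minimal sketch of that kind of preprocessing, written as plain JavaScript so it can be tried outside the agent. The function name normalizeExtraHopCef and the shortened sample line are made up for illustration. It assumes the timestamps are always ISO 8601 in UTC, and that epoch milliseconds are acceptable to decode_cef. Note that a " rt=..." or " dst=..." sequence inside the msg text would be rewritten too.

```javascript
// Hypothetical helper: align an ExtraHop CEF line with what decode_cef expects.
function normalizeExtraHopCef(line) {
  // ISO 8601 UTC timestamps in rt/start/end become epoch milliseconds.
  var timeRegex = / (rt|start|end)=(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z)/g;
  // dst/src values that are MAC addresses move to dmac/smac.
  var macRegex = / (dst|src)=((?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2})(?= |$)/g;

  return line
    .replace(timeRegex, function (match, key, time) {
      return " " + key + "=" + Date.parse(time);
    })
    .replace(macRegex, function (match, key, mac) {
      return " " + (key === "dst" ? "dmac" : "smac") + "=" + mac;
    });
}

// Shortened version of the sample event's extension part.
var sample = "cs2Label=category rt=2021-11-19T20:00:00.000Z end=2021-11-19T19:41:30.000Z start=2021-11-19T19:40:00.001Z src=10.54.10.60 dst=00:50:56:B9:7E:52 msg=example";
console.log(normalizeExtraHopCef(sample));
```

The src value is left alone here because it is an IP address; only MAC-shaped values are moved.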

Hi Andrew, thanks for the insights.

I've moved from the CEF integration to "Custom UDP Logs" (https://docs.elastic.co/integrations/udp) and am able to parse most of the fields correctly, as shown below.

A few questions:

  • Can you confirm whether there are any shortcomings or major flaws in the index template or ingest pipeline, or anything that is not recommended?
  • The custom field extrahop.detection.description contains newline \n characters, which I'm able to remove using the gsub processor. Is there a better way to process them automatically, so that a literal newline is inserted instead for readability in the logs?

Ingest Pipeline for ExtraHop using Elastic Agent Integration Custom UDP Logs

{
  "description": "Pipeline for ExtraHop CEF logs.",
  "_meta": {
    "package": {
      "name": "extrahop"
    }
  },
  "processors": [
    {
      "set": {
        "field": "ecs.version",
        "value": "8.9.0"
      }
    },
    {
      "rename": {
        "field": "message",
        "target_field": "event.original",
        "ignore_missing": true
      }
    },
    {
      "dissect": {
        "field": "event.original",
        "pattern": "<%{_tmp.header}>%{_tmp.timestamp} %{extrahop.detection.host} CEF:%{cef.version}|%{cef.device.vendor}|%{cef.device.product}|%{cef.device.version}|%{cef.device.event_class_id}|%{cef.name}|%{cef.severity}|%{_tmp.cef_extensions}"
      }
    },
    {
      "grok": {
        "field": "_tmp.cef_extensions",
        "patterns": [
          "%{EXTRAHOP_MV_FIELDS:_tmp.mvfields} msg=%{EXTRAHOP_MESSAGE:_tmp.message}"
        ],
        "pattern_definitions": {
          "EXTRAHOP_MV_FIELDS": ".*(?= msg=)",
          "EXTRAHOP_MESSAGE": "(.*)"
        }
      }
    },
    {
      "kv": {
        "field": "_tmp.mvfields",
        "field_split": " ",
        "value_split": "=",
        "target_field": "_tmp"
      }
    },
    {
      "date": {
        "field": "_tmp.timestamp",
        "formats": [
          "ISO8601"
        ],
        "if": "ctx?._tmp?.timestamp != null"
      }
    },
    {
      "set": {
        "if": "ctx?._tmp?.start != \"none\"",
        "field": "extrahop.detection.startTime",
        "value": "{{_tmp.start}}"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.endTime",
        "value": "{{_tmp.end}}",
        "if": "ctx?._tmp?.end != \"none\""
      }
    },
    {
      "set": {
        "field": "extrahop.detection.deviceReceiptTime",
        "value": "{{_tmp.rt}}",
        "if": "ctx?._tmp?.rt != \"none\""
      }
    },
    {
      "grok": {
        "field": "_tmp.src",
        "patterns": [
          "%{SRC}"
        ],
        "pattern_definitions": {
          "SRC": "(%{IP:source.ip}|%{MAC:source.mac})"
        },
        "ignore_missing": true
      }
    },
    {
      "grok": {
        "field": "_tmp.dst",
        "patterns": [
          "%{DST}"
        ],
        "pattern_definitions": {
          "DST": "(%{IP:destination.ip}|%{MAC:destination.mac})"
        },
        "ignore_missing": true
      }
    },
    {
      "geoip": {
        "field": "source.ip",
        "ignore_missing": true,
        "tag": "source geo",
        "target_field": "source.geo"
      }
    },
    {
      "geoip": {
        "field": "destination.ip",
        "ignore_missing": true,
        "tag": "destination geo",
        "target_field": "destination.geo"
      }
    },
    {
      "geoip": {
        "field": "source.ip",
        "database_file": "GeoLite2-ASN.mmdb",
        "ignore_missing": true,
        "properties": [
          "asn",
          "organization_name"
        ],
        "target_field": "source.as"
      }
    },
    {
      "geoip": {
        "field": "destination.ip",
        "database_file": "GeoLite2-ASN.mmdb",
        "ignore_missing": true,
        "properties": [
          "asn",
          "organization_name"
        ],
        "target_field": "destination.as"
      }
    },
    {
      "rename": {
        "field": "source.as.asn",
        "ignore_missing": true,
        "target_field": "source.as.number"
      }
    },
    {
      "rename": {
        "field": "source.as.organization_name",
        "ignore_missing": true,
        "target_field": "source.as.organization.name"
      }
    },
    {
      "rename": {
        "field": "destination.as.asn",
        "ignore_missing": true,
        "target_field": "destination.as.number"
      }
    },
    {
      "rename": {
        "field": "destination.as.organization_name",
        "ignore_missing": true,
        "target_field": "destination.as.organization.name"
      }
    },
    {
      "append": {
        "if": "ctx?.destination?.ip != null && ctx?.destination?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{destination.ip}}"
      }
    },
    {
      "append": {
        "if": "ctx?.destination?.nat?.ip != null && ctx?.destination?.nat?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{destination.nat.ip}}"
      }
    },
    {
      "append": {
        "if": "ctx?.source?.ip != null && ctx?.source?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{source.ip}}"
      }
    },
    {
      "append": {
        "if": "ctx?.source?.nat?.ip != null && ctx?.source?.nat?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{source.nat.ip}}"
      }
    },
    {
      "gsub": {
        "field": "destination.mac",
        "ignore_missing": true,
        "pattern": "[:.]",
        "replacement": "-",
        "tag": "gsub mac"
      }
    },
    {
      "gsub": {
        "field": "source.mac",
        "ignore_missing": true,
        "pattern": "[:.]",
        "replacement": "-",
        "tag": "gsub mac"
      }
    },
    {
      "uppercase": {
        "field": "destination.mac",
        "ignore_missing": true
      }
    },
    {
      "uppercase": {
        "field": "source.mac",
        "ignore_missing": true
      }
    },
    {
      "set": {
        "field": "extrahop.detection.title",
        "value": "{{cef.name}}"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.severity",
        "value": "{{cef.severity}}",
        "ignore_empty_value": true
      }
    },
    {
      "set": {
        "field": "extrahop.detection.id",
        "value": "{{_tmp.cn1}}"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.risk_score",
        "value": "{{_tmp.cn2}}",
        "ignore_empty_value": true
      }
    },
    {
      "set": {
        "field": "extrahop.detection.url",
        "value": "{{_tmp.cs1}}"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.category",
        "value": "{{_tmp.cs2}}"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.description",
        "value": "{{_tmp.message}}"
      }
    },
    {
      "gsub": {
        "field": "extrahop.detection.description",
        "pattern": "\\.\\\\\\\\n",
        "replacement": "\\. ",
        "ignore_missing": true
      }
    },
    {
      "gsub": {
        "field": "extrahop.detection.description",
        "pattern": "\\\\\\\\n\\*",
        "replacement": " \\*",
        "ignore_missing": true
      }
    },
    {
      "gsub": {
        "field": "extrahop.detection.description",
        "pattern": "(\\\\\\\\n|\\\\\\\\)",
        "replacement": "",
        "ignore_missing": true
      }
    },
    {
      "remove": {
        "field": "event.original",
        "ignore_missing": true,
        "if": "ctx?.tags == null || !(ctx.tags.contains('preserve_original_event'))",
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": [
          "cloud",
          "host",
          "_tmp"
        ],
        "ignore_missing": true
      }
    }
  ],
  "on_failure": [
    {
      "remove": {
        "field": [
          "_tmp"
        ],
        "ignore_missing": true
      }
    },
    {
      "append": {
        "field": "error.message",
        "value": "Processor \"{{ _ingest.on_failure_processor_type }}\" with tag \"{{ _ingest.on_failure_processor_tag }}\" in pipeline \"{{ _ingest.on_failure_pipeline }}\" failed with message \"{{ _ingest.on_failure_message }}\""
      }
    },
    {
      "set": {
        "field": "event.kind",
        "value": "pipeline_error"
      }
    }
  ]
}

Index Template

{
  "priority": 500,
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "logs"
        },
        "codec": "best_compression",
        "default_pipeline": "logs-extrahop-pipeline-v1.0.0",
        "mapping": {
          "total_fields": {
            "limit": "10000"
          },
          "ignore_malformed": "true"
        },
        "query": {
          "default_field": [
            "input.type",
            "log.file.path",
            "log.flags",
            "log.source.address",
            "cef.device.event_class_id",
            "cef.device.product",
            "cef.device.vendor",
            "cef.device.version",
            "cef.name",
            "cef.severity",
            "cef.version",
            "destination.ip",
            "destination.mac",
            "destination.service.name",
            "extrahop.detection.category",
            "extrahop.detection.description",
            "extrahop.detection.deviceReceiptTime",
            "extrahop.detection.endTime",
            "extrahop.detection.host",
            "extrahop.detection.id",
            "extrahop.detection.risk_score",
            "extrahop.detection.severity",
            "extrahop.detection.startTime",
            "extrahop.detection.title",
            "extrahop.detection.url",
            "source.ip",
            "source.mac",
            "source.service.name",
            "related.ip"
          ]
        }
      }
    },
    "mappings": {
      "_routing": {
        "required": false
      },
      "numeric_detection": false,
      "dynamic_date_formats": [
        "strict_date_optional_time",
        "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"
      ],
      "_source": {
        "excludes": [],
        "includes": [],
        "enabled": true
      },
      "dynamic": true,
      "dynamic_templates": [],
      "date_detection": true,
      "properties": {
        "extrahop": {
          "type": "object",
          "properties": {
            "detection": {
              "type": "object",
              "properties": {
                "severity": {
                  "type": "keyword"
                },
                "risk_score": {
                  "type": "long"
                },
                "host": {
                  "type": "keyword"
                },
                "description": {
                  "type": "keyword"
                },
                "startTime": {
                  "type": "date"
                },
                "endTime": {
                  "type": "date"
                },
                "id": {
                  "type": "long"
                },
                "category": {
                  "type": "keyword"
                },
                "deviceReceiptTime": {
                  "type": "date"
                },
                "title": {
                  "type": "keyword"
                },
                "url": {
                  "type": "keyword"
                }
              }
            }
          }
        },
        "input": {
          "type": "object",
          "properties": {
            "type": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "@timestamp": {
          "ignore_malformed": false,
          "type": "date"
        },
        "related": {
          "type": "object",
          "properties": {
            "ip": {
              "type": "ip"
            }
          }
        },
        "cef": {
          "type": "object",
          "properties": {
            "severity": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "device": {
              "type": "object",
              "properties": {
                "product": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "event_class_id": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "vendor": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "version": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            },
            "version": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        },
        "log": {
          "type": "object",
          "properties": {
            "file": {
              "type": "object",
              "properties": {
                "path": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            },
            "offset": {
              "type": "long"
            },
            "flags": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "source": {
              "type": "object",
              "properties": {
                "address": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            }
          }
        },
        "data_stream": {
          "type": "object",
          "properties": {
            "namespace": {
              "type": "constant_keyword"
            },
            "type": {
              "type": "constant_keyword"
            },
            "dataset": {
              "type": "constant_keyword"
            }
          }
        },
        "destination": {
          "type": "object",
          "properties": {
            "service": {
              "type": "object",
              "properties": {
                "name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            },
            "ip": {
              "type": "ip"
            },
            "mac": {
              "type": "keyword"
            }
          }
        },
        "source": {
          "type": "object",
          "properties": {
            "service": {
              "type": "object",
              "properties": {
                "name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            },
            "ip": {
              "type": "ip"
            },
            "mac": {
              "type": "keyword"
            }
          }
        },
        "event": {
          "type": "object",
          "properties": {
            "module": {
              "eager_global_ordinals": false,
              "norms": false,
              "index": true,
              "store": false,
              "type": "keyword",
              "split_queries_on_whitespace": false,
              "index_options": "docs",
              "doc_values": true
            },
            "dataset": {
              "eager_global_ordinals": false,
              "norms": false,
              "index": true,
              "store": false,
              "type": "keyword",
              "split_queries_on_whitespace": false,
              "index_options": "docs",
              "doc_values": true
            }
          }
        }
      }
    }
  },
  "index_patterns": [
    "logs-extrahop-*"
  ],
  "data_stream": {
    "hidden": false,
    "allow_custom_routing": false
  },
  "composed_of": [
    ".fleet_globals-1",
    ".fleet_agent_id_verification-1"
  ],
  "_meta": {
    "package": {
      "name": "extrahop"
    }
  }
}

Normally you would not do anything to a string that contains \n and it would be rendered as a newline by Kibana.

I think you are having issues because you use mustache templates in your pipeline. The simplest way to fix the extra escapes that are being added is to change every {{ foo }} to {{{ foo }}}. This stops the escaping from being applied.

From mustache(5) - Logic-less templates.

All variables are HTML escaped by default. If you want to return unescaped HTML, use the triple mustache: {{{name}}}.

A better fix is to avoid using templates entirely, and use set with copy_from instead of value.

For anyone who wants an example of preprocessing the data to make it work with decode_cef, here's one:

- convert:
    mode: copy
    fields:
      - { from: "message", to: "event.original" }

- script:
    lang: javascript
    source: |
      var extrahop = (function () {
          var processor = require("processor");

          var dstMacRegex = / dst=(([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2}))/gm;
          var srcMacRegex = / src=(([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2}))/gm;

          // ExtraHop uses the wrong field names for MAC addresses.
          var fixMacAddressFields = function(evt) {
            var msg = evt.Get("message");
            msg = msg.replace(dstMacRegex, " dmac=$1");
            msg = msg.replace(srcMacRegex, " smac=$1");
            evt.Put("message", msg);
          };

          var timeRegex = / (rt|start|end)=(\d{4}-\d{2}-\d{2}T\d{1,2}:\d{2}:\d{2}\.\d+Z)/gm;

          // ExtraHop does not format timestamps as per the CEF spec, so convert
          // them to Unix epoch in milliseconds.
          var fixTimestamps = function(evt) {
            var msg = evt.Get("message");
            msg = msg.replace(timeRegex, function(match, key, time) {
              return " " + key + "=" + Date.parse(time);
            });
            evt.Put("message", msg);
          };

          var newlineRegex = /(?:\r\n|\r|\n)/g;

          // Newline (aka line feed) characters are supposed to be encoded as '\n', but
          // on the wire 0xA was being received.
          var encodeNewline = function(evt) {
            var msg = evt.Get("message");
            msg = msg.replace(newlineRegex, "\\n");
            msg = msg.trim();
            evt.Put("message", msg);
          };

          var extrahopProcessor = new processor.Chain()
              .Add(encodeNewline)
              .Add(fixTimestamps)
              .Add(fixMacAddressFields)
              .Build();

          return {
              process: function (evt) {
                  extrahopProcessor.Run(evt);
              },
          };
      })();

      function process(evt) {
          return extrahop.process(evt);
      }

I tried both set with copy_from and {{{FIELD}}} instead of {{FIELD}}. Same result: the description field still contains \n.

New Ingest Pipeline

PUT _ingest/pipeline/logs-extrahop-pipeline-v1.0.1
{
  "description": "Pipeline for ExtraHop CEF logs.",
  "_meta": {
    "package": {
      "name": "extrahop"
    }
  },
  "processors": [
    {
      "set": {
        "field": "ecs.version",
        "value": "8.9.0"
      }
    },
    {
      "set": {
        "field": "event.original",
        "copy_from": "message"
      }
    },
    {
      "dissect": {
        "field": "event.original",
        "pattern": "<%{_tmp.header}>%{_tmp.timestamp} %{extrahop.detection.host} CEF:%{cef.version}|%{cef.device.vendor}|%{cef.device.product}|%{cef.device.version}|%{cef.device.event_class_id}|%{cef.name}|%{cef.severity}|%{_tmp.cef_extensions}"
      }
    },
    {
      "grok": {
        "field": "_tmp.cef_extensions",
        "patterns": [
          "%{EXTRAHOP_MVFIELDS:_tmp.extrahop_mvfields} msg=%{EXTRAHOP_MESSAGE:_tmp.extrahop_message}"
        ],
        "pattern_definitions": {
          "EXTRAHOP_MVFIELDS": ".*(?= msg=)",
          "EXTRAHOP_MESSAGE": "(.*)"
        }
      }
    },
    {
      "kv": {
        "field": "_tmp.extrahop_mvfields",
        "field_split": " ",
        "value_split": "=",
        "target_field": "_tmp"
      }
    },
    {
      "date": {
        "field": "_tmp.timestamp",
        "formats": [
          "ISO8601"
        ],
        "if": "ctx?._tmp?.timestamp != null"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.startTime",
        "copy_from": "_tmp.start",
        "if": "ctx?._tmp?.start != \"none\""
      }
    },
    {
      "set": {
        "field": "extrahop.detection.endTime",
        "copy_from": "_tmp.end",
        "if": "ctx?._tmp?.end != \"none\""
      }
    },
    {
      "set": {
        "field": "extrahop.detection.deviceReceiptTime",
        "copy_from": "_tmp.rt",
        "if": "ctx?._tmp?.rt != \"none\""
      }
    },
    {
      "grok": {
        "field": "_tmp.src",
        "patterns": [
          "%{SRC}"
        ],
        "pattern_definitions": {
          "SRC": "(%{IP:source.ip}|%{MAC:source.mac})"
        },
        "ignore_missing": true
      }
    },
    {
      "grok": {
        "field": "_tmp.dst",
        "patterns": [
          "%{DST}"
        ],
        "pattern_definitions": {
          "DST": "(%{IP:destination.ip}|%{MAC:destination.mac})"
        },
        "ignore_missing": true
      }
    },
    {
      "geoip": {
        "field": "source.ip",
        "ignore_missing": true,
        "tag": "source geo",
        "target_field": "source.geo"
      }
    },
    {
      "geoip": {
        "field": "destination.ip",
        "ignore_missing": true,
        "tag": "destination geo",
        "target_field": "destination.geo"
      }
    },
    {
      "geoip": {
        "field": "source.ip",
        "database_file": "GeoLite2-ASN.mmdb",
        "ignore_missing": true,
        "properties": [
          "asn",
          "organization_name"
        ],
        "target_field": "source.as"
      }
    },
    {
      "geoip": {
        "field": "destination.ip",
        "database_file": "GeoLite2-ASN.mmdb",
        "ignore_missing": true,
        "properties": [
          "asn",
          "organization_name"
        ],
        "target_field": "destination.as"
      }
    },
    {
      "rename": {
        "field": "source.as.asn",
        "ignore_missing": true,
        "target_field": "source.as.number"
      }
    },
    {
      "rename": {
        "field": "source.as.organization_name",
        "ignore_missing": true,
        "target_field": "source.as.organization.name"
      }
    },
    {
      "rename": {
        "field": "destination.as.asn",
        "ignore_missing": true,
        "target_field": "destination.as.number"
      }
    },
    {
      "rename": {
        "field": "destination.as.organization_name",
        "ignore_missing": true,
        "target_field": "destination.as.organization.name"
      }
    },
    {
      "append": {
        "if": "ctx?.destination?.ip != null && ctx?.destination?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{destination.ip}}"
      }
    },
    {
      "append": {
        "if": "ctx?.destination?.nat?.ip != null && ctx?.destination?.nat?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{destination.nat.ip}}"
      }
    },
    {
      "append": {
        "if": "ctx?.source?.ip != null && ctx?.source?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{source.ip}}"
      }
    },
    {
      "append": {
        "if": "ctx?.source?.nat?.ip != null && ctx?.source?.nat?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{source.nat.ip}}"
      }
    },
    {
      "gsub": {
        "field": "destination.mac",
        "ignore_missing": true,
        "pattern": "[:.]",
        "replacement": "-",
        "tag": "gsub mac"
      }
    },
    {
      "gsub": {
        "field": "source.mac",
        "ignore_missing": true,
        "pattern": "[:.]",
        "replacement": "-",
        "tag": "gsub mac"
      }
    },
    {
      "uppercase": {
        "field": "destination.mac",
        "ignore_missing": true
      }
    },
    {
      "uppercase": {
        "field": "source.mac",
        "ignore_missing": true
      }
    },
    {
      "set": {
        "field": "extrahop.detection.title",
        "copy_from": "cef.name"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.severity",
        "copy_from": "cef.severity",
        "ignore_empty_value": true
      }
    },
    {
      "set": {
        "field": "extrahop.detection.id",
        "copy_from": "_tmp.cn1"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.risk_score",
        "copy_from": "_tmp.cn2",
        "ignore_empty_value": true
      }
    },
    {
      "set": {
        "field": "extrahop.detection.url",
        "copy_from": "_tmp.cs1"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.category",
        "copy_from": "_tmp.cs2"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.description",
        "value": "{{{_tmp.extrahop_message}}}"
      }
    },
    {
      "remove": {
        "field": "event.original",
        "ignore_missing": true,
        "if": "ctx?.tags == null || !(ctx.tags.contains('preserve_original_event'))",
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": [
          "cloud",
          "host",
          "_tmp",
          "message"
        ],
        "ignore_missing": true
      }
    }
  ],
  "on_failure": [
    {
      "remove": {
        "field": [
          "_tmp"
        ],
        "ignore_missing": true
      }
    },
    {
      "append": {
        "field": "error.message",
        "value": "Processor \"{{ _ingest.on_failure_processor_type }}\" with tag \"{{ _ingest.on_failure_processor_tag }}\" in pipeline \"{{ _ingest.on_failure_pipeline }}\" failed with message \"{{ _ingest.on_failure_message }}\""
      }
    },
    {
      "set": {
        "field": "event.kind",
        "value": "pipeline_error"
      }
    }
  ]
}

Newlines within CEF extension values are encoded as \n. decode_cef converts them back to an ASCII line feed (0x0A) automatically for each extension.

To replicate that in your pipeline, immediately after the set event.original processor, try adding:

{
  "gsub": {
    "field": "message",
    "pattern": "\\\\n",
    "replacement": "\n"
  }
},
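To make the conversion described above concrete, here is a rough JavaScript sketch of unescaping a single CEF extension value. The helper name unescapeCefValue is hypothetical, and handling \r, \= and \\ alongside \n is my reading of the CEF escaping rules rather than something confirmed in this thread. Doing it in one pass matters: an escaped backslash followed by n must stay a backslash and an n, not become a line feed.

```javascript
// Hypothetical helper: undo CEF escaping in one extension value.
// \n and \r become line breaks, \= becomes =, \\ becomes a single backslash.
function unescapeCefValue(value) {
  return value.replace(/\\([\\=nr])/g, function (match, ch) {
    if (ch === "n") return "\n";
    if (ch === "r") return "\r";
    return ch; // an escaped backslash or equals sign
  });
}

// In a JS string literal "\\n" is the two characters backslash + n.
var raw = "overview?from\\=1637350800 received:\\n* 1.5GB";
console.log(JSON.stringify(unescapeCefValue(raw)));
```

Printing through JSON.stringify shows a real line feed as \n inside the quotes, so the result can be checked by eye.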

I'm not sure I understand the ask. I tried the above in the ingest pipeline, and it adds an extra newline character \n. The ingest pipeline I created and shared above does not use decode_cef.

I'm trying to remove all \n characters and \ from the field extrahop.detection.description

(#/metrics/devices/7e2dff4fa4dc4e58889e0f0399974a66.fff4380a4a0a0000/overview?from\=1637350800&interval_type\=DT&until\=1637350890) received an unusual amount of data from internal resources.\n\nThe VPN client received:\n* 1.5GB from 10.210.3.216 over HTTP\n\n

should be converted to the following:

(#/metrics/devices/7e2dff4fa4dc4e58889e0f0399974a66.fff4380a4a0a0000/overview?from=1637350800&interval_type=DT&until=1637350890) received an unusual amount of data from internal resources. The VPN client received: * 1.5GB from `10.210.3.216` over HTTP
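
One possible way to get there, sketched under the assumption that the description holds the literal two-character sequence backslash + n (0x5C6E) rather than a real line feed: collapse each run of \n into a single space, drop the remaining backslashes, then trim the trailing space. The body below could be sent to POST _ingest/pipeline/_simulate to try it out. The test value is a shortened, made-up stand-in for the real description, and the "description" options are only there as comments.

```json
{
  "pipeline": {
    "processors": [
      {
        "gsub": {
          "description": "Assumption: newlines arrive as backslash + n; collapse each run into one space",
          "field": "extrahop.detection.description",
          "pattern": "(\\\\n)+",
          "replacement": " "
        }
      },
      {
        "gsub": {
          "description": "Remove any backslashes that are left over",
          "field": "extrahop.detection.description",
          "pattern": "\\\\",
          "replacement": ""
        }
      },
      {
        "trim": {
          "description": "Drop the space left behind by the trailing newline sequences",
          "field": "extrahop.detection.description"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "extrahop": {
          "detection": {
            "description": "overview?from\\=1637350800&until\\=1637350890) received data.\\n\\nThe VPN client received:\\n* 1.5GB over HTTP\\n\\n"
          }
        }
      }
    }
  ]
}
```

If the events turn out to contain real line feeds instead, the first pattern would need to match those rather than the escaped sequence.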

Can you share an event in JSON format that contains the event.original value as received by the UDP input? It's not clear to me whether ExtraHop is sending 0x0A or 0x5C6E for newline. Seeing the data encoded as JSON should remove any ambiguity.
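
To illustrate why JSON settles the question, here is a hand-written sketch, not captured data. The wrapper keys line_feed_0x0A and backslash_n_0x5C6E are made-up labels, and the text is shortened from the sample event. A real line feed shows up as \n inside the JSON string, whereas a literal backslash followed by n shows up as \\n.

```json
{
  "line_feed_0x0A": {
    "event": {
      "original": "internal resources.\n\nThe VPN client received:\n* 1.5GB"
    }
  },
  "backslash_n_0x5C6E": {
    "event": {
      "original": "internal resources.\\n\\nThe VPN client received:\\n* 1.5GB"
    }
  }
}
```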

Here are two sample logs captured off the wire. They are in CEF format, not JSON, and are the same as what appears in the message field, which the set processor in the ingest pipeline later copies to event.original.

<14>2023-10-22T14:19:35.944Z xdccxehoroda.sterling-cooper.org CEF:0|ExtraHop|Reveal(x)|7.8|1|Kerberos Brute Force|6|cn1=30064968014 cn1Label=detectionID cn2=60 cn2Label=riskScore cs1=https://steerlingcooperkc.cloud.extrahop.com/extrahop/#/detections/detail/30064968014 cs1Label=detectionURL cs2=sec.exploit,sec,sec.attack cs2Label=category rt=2023-10-22T14:13:30.000Z end=2023-10-22T14:12:00.000Z start=2023-10-22T19:03:00.001Z src=10.11.56.195 dst=00:50:56:94:3E:F9 msg=[KC1T42206](#/metrics/devices/1260b20b154e4859bfd314600ff363be.fff4c3380b0a0000/overview?from\=1697137380&interval_type\=DT&until\=1697206320) received an unusually high number of the KDC_ERR_PREAUTH_FAILED error, which indicates failed login attempts with an invalid password. An attacker might have compromised this client and is attempting to guess passwords.\n\nTargeted accounts:\n\n* kc1t42206STEERLING\-COOPER\.ORG\n

<14>2023-10-22T14:43:08.892Z xdccxehoroda.sterling-cooper.org CEF:0|ExtraHop|Reveal(x)|7.8|1|Certificate Expiration Warning|3|cn1=25769990748 cn1Label=detectionID cn2=35 cn2Label=riskScore cs1=https://steerlingcooperkc.cloud.extrahop.com/extrahop/#/detections/detail/25769990748 cs1Label=detectionURL cs2=sec,sec.caution,sec.attack cs2Label=category rt=2023-10-22T14:43:08.785Z end=none start=2023-10-22T12:32:42.122Z src=00:50:56:AA:56:BB msg=A certificate with a subject of PD1MJESONSLS1V.sterling-cooper.org and an expiration of Fri Nov 17 2023 was observed on the offender(s). This certificate is valid for 34 more day(s).\n\n* This certificate serial number is: 7d557fb13316ec8740c032f83b2caf86\n* The issuer is PD1MJESONSLS1V.sterling-cooper.org

Well, this is true of proper CEF encoding, but not of ExtraHop according to those samples. You have raw 0x0A characters coming in over the wire. So if you want to replace those, I would add a gsub before you begin the parsing, like this:

   - set:
       field: event.original
       copy_from: message
+  - gsub:
+      field: message
+      pattern: "\\R"
+      replacement: " "
   - dissect:
-      field: event.original
+      field: message

I updated my example of using Beats processors to correct the data and moved it into ExtraHop CEF logging to Filebeat · GitHub. You could use that processors config with the Custom UDP input. And then handle the renames for custom fields via the ingest node pipeline.

This has no effect at all. The pattern \\R matches nothing in the message field.

Regarding "Raw Event 2", for some reason it has escaped the dash and the dot. So it has \- and \. and those are not valid escape sequences in CEF.

I wonder if they were meant to be escapes, in which case we should probably just remove the escape and leave STEERLING-COOPER.ORG. WDYT?

I captured some more events; most of the time, - and . are escaped. This is one of the reasons I avoided the decode_cef processor for ExtraHop logs and instead used the Custom UDP Logs integration, parsing the logs with dissect and grok to have complete control over parsing. With the custom ingest pipeline I shared earlier, Kibana is not respecting the newline \n characters; everything else seems to be good.

Seems like some success.

What's changed:

  • Added a gsub processor that replaces the pattern \\\\n with \n.
  • Added a gsub processor that replaces the pattern \\\\ with an empty string.

Notes | Concerns

In the Kibana ingest pipeline UI, though, the replacement for the first gsub processor appears empty, even though I set it to \n when creating the ingest pipeline through Dev Tools. See screenshot ingest_pipeline_00.png.

The reverse is also true: if I add \n through the Kibana UI (see ingest_pipeline_01.png), it is changed to \\n in the API call, which is not what I want.

So I used Dev Tools to create this ingest pipeline, so that \\\\n is replaced with \n rather than with \\n.

In short, \n can be set as the gsub replacement through Dev Tools, but not through the UI.
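
For anyone who wants to compare the two replacement strings side by side, a request body along these lines could be sent to POST _ingest/pipeline/_simulate. This is only a sketch: the field msg and the targets with_json_newline and with_double_backslash are names invented for the test. In the first processor the JSON string "\n" decodes to a real line feed before the processor sees it. In the second, "\\n" reaches the processor as a backslash followed by n, and how the regex engine treats that is easiest to confirm from the simulate output.

```json
{
  "pipeline": {
    "processors": [
      {
        "gsub": {
          "description": "Replacement is a real line feed after JSON decoding (Dev Tools case)",
          "field": "msg",
          "pattern": "\\\\n",
          "replacement": "\n",
          "target_field": "with_json_newline"
        }
      },
      {
        "gsub": {
          "description": "Replacement reaches the processor as backslash + n (what the UI submitted)",
          "field": "msg",
          "pattern": "\\\\n",
          "replacement": "\\n",
          "target_field": "with_double_backslash"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "msg": "line one\\nline two"
      }
    }
  ]
}
```

Writing each result to its own target_field leaves msg untouched, so both outcomes can be read from the same simulated document.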


Results

Here are the results in Kibana, where the newline characters are now rendered correctly.


Ingest Pipeline

PUT _ingest/pipeline/logs-extrahop-pipeline-v1.0.5
{
  "description": "Pipeline for ExtraHop CEF logs.",
  "_meta": {
    "package": {
      "name": "extrahop"
    }
  },
  "processors": [
    {
      "set": {
        "field": "ecs.version",
        "value": "8.9.0"
      }
    },
    {
      "set": {
        "field": "event.original",
        "copy_from": "message"
      }
    },
    {
      "rename": {
        "field": "message",
        "target_field": "_tmp.message"
      }
    },
    {
      "dissect": {
        "field": "_tmp.message",
        "pattern": "<%{_tmp.header}>%{_tmp.timestamp} %{extrahop.detection.host} CEF:%{cef.version}|%{cef.device.vendor}|%{cef.device.product}|%{cef.device.version}|%{cef.device.event_class_id}|%{cef.name}|%{cef.severity}|%{_tmp.cef_extensions}"
      }
    },
    {
      "grok": {
        "field": "_tmp.cef_extensions",
        "patterns": [
          "(?m)%{EXTRAHOP_MVFIELDS:_tmp.extrahop_mvfields} msg=%{GREEDYDATA:_tmp.extrahop_message}"
        ],
        "pattern_definitions": {
          "EXTRAHOP_MVFIELDS": ".*(?= msg=)"
        }
      }
    },
    {
      "kv": {
        "field": "_tmp.extrahop_mvfields",
        "field_split": " ",
        "value_split": "=",
        "target_field": "_tmp"
      }
    },
    {
      "date": {
        "field": "_tmp.timestamp",
        "formats": [
          "ISO8601"
        ],
        "if": "ctx?._tmp?.timestamp != null"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.startTime",
        "copy_from": "_tmp.start",
        "if": "ctx?._tmp?.start != \"none\""
      }
    },
    {
      "set": {
        "field": "extrahop.detection.endTime",
        "copy_from": "_tmp.end",
        "if": "ctx?._tmp?.end != \"none\""
      }
    },
    {
      "set": {
        "field": "extrahop.detection.deviceReceiptTime",
        "copy_from": "_tmp.rt",
        "if": "ctx?._tmp?.rt != \"none\""
      }
    },
    {
      "grok": {
        "field": "_tmp.src",
        "patterns": [
          "%{SRC}"
        ],
        "pattern_definitions": {
          "SRC": "(%{IP:source.ip}|%{MAC:source.mac})"
        },
        "ignore_missing": true
      }
    },
    {
      "grok": {
        "field": "_tmp.dst",
        "patterns": [
          "%{DST}"
        ],
        "pattern_definitions": {
          "DST": "(%{IP:destination.ip}|%{MAC:destination.mac})"
        },
        "ignore_missing": true
      }
    },
    {
      "geoip": {
        "field": "source.ip",
        "ignore_missing": true,
        "tag": "source geo",
        "target_field": "source.geo"
      }
    },
    {
      "geoip": {
        "field": "destination.ip",
        "ignore_missing": true,
        "tag": "destination geo",
        "target_field": "destination.geo"
      }
    },
    {
      "geoip": {
        "field": "source.ip",
        "database_file": "GeoLite2-ASN.mmdb",
        "ignore_missing": true,
        "properties": [
          "asn",
          "organization_name"
        ],
        "target_field": "source.as"
      }
    },
    {
      "geoip": {
        "field": "destination.ip",
        "database_file": "GeoLite2-ASN.mmdb",
        "ignore_missing": true,
        "properties": [
          "asn",
          "organization_name"
        ],
        "target_field": "destination.as"
      }
    },
    {
      "rename": {
        "field": "source.as.asn",
        "ignore_missing": true,
        "target_field": "source.as.number"
      }
    },
    {
      "rename": {
        "field": "source.as.organization_name",
        "ignore_missing": true,
        "target_field": "source.as.organization.name"
      }
    },
    {
      "rename": {
        "field": "destination.as.asn",
        "ignore_missing": true,
        "target_field": "destination.as.number"
      }
    },
    {
      "rename": {
        "field": "destination.as.organization_name",
        "ignore_missing": true,
        "target_field": "destination.as.organization.name"
      }
    },
    {
      "append": {
        "if": "ctx?.destination?.ip != null && ctx?.destination?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{destination.ip}}"
      }
    },
    {
      "append": {
        "if": "ctx?.destination?.nat?.ip != null && ctx?.destination?.nat?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{destination.nat.ip}}"
      }
    },
    {
      "append": {
        "if": "ctx?.source?.ip != null && ctx?.source?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{source.ip}}"
      }
    },
    {
      "append": {
        "if": "ctx?.source?.nat?.ip != null && ctx?.source?.nat?.ip != ''",
        "field": "related.ip",
        "allow_duplicates": false,
        "value": "{{source.nat.ip}}"
      }
    },
    {
      "gsub": {
        "field": "destination.mac",
        "ignore_missing": true,
        "pattern": "[:.]",
        "replacement": "-",
        "tag": "gsub mac"
      }
    },
    {
      "gsub": {
        "field": "source.mac",
        "ignore_missing": true,
        "pattern": "[:.]",
        "replacement": "-",
        "tag": "gsub mac"
      }
    },
    {
      "uppercase": {
        "field": "destination.mac",
        "ignore_missing": true
      }
    },
    {
      "uppercase": {
        "field": "source.mac",
        "ignore_missing": true
      }
    },
    {
      "set": {
        "field": "extrahop.detection.title",
        "copy_from": "cef.name"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.severity",
        "copy_from": "cef.severity",
        "ignore_empty_value": true
      }
    },
    {
      "set": {
        "field": "extrahop.detection.id",
        "copy_from": "_tmp.cn1"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.risk_score",
        "copy_from": "_tmp.cn2",
        "ignore_empty_value": true
      }
    },
    {
      "set": {
        "field": "extrahop.detection.url",
        "copy_from": "_tmp.cs1"
      }
    },
    {
      "set": {
        "field": "extrahop.detection.category",
        "copy_from": "_tmp.cs2"
      }
    },
    {
      "gsub": {
        "field": "_tmp.extrahop_message",
        "pattern": "\\\\n",
        "replacement": "\n"
      }
    },
    {
      "gsub": {
        "field": "_tmp.extrahop_message",
        "pattern": "\\\\",
        "replacement": ""
      }
    },
    {
      "set": {
        "field": "extrahop.detection.description",
        "copy_from": "_tmp.extrahop_message"
      }
    },
    {
      "remove": {
        "field": "event.original",
        "ignore_missing": true,
        "if": "ctx?.tags == null || !(ctx.tags.contains('preserve_original_event'))",
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": [
          "cloud",
          "host",
          "_tmp"
        ],
        "ignore_missing": true
      }
    }
  ],
  "on_failure": [
    {
      "remove": {
        "field": [
          "_tmp"
        ],
        "ignore_missing": true
      }
    },
    {
      "append": {
        "field": "error.message",
        "value": "Processor \"{{ _ingest.on_failure_processor_type }}\" with tag \"{{ _ingest.on_failure_processor_tag }}\" in pipeline \"{{ _ingest.on_failure_pipeline }}\" failed with message \"{{ _ingest.on_failure_message }}\""
      }
    },
    {
      "set": {
        "field": "event.kind",
        "value": "pipeline_error"
      }
    }
  ]
}
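
To check the stored pipeline against a sample before attaching it to a data stream, a body like the one below could be sent to POST _ingest/pipeline/logs-extrahop-pipeline-v1.0.5/_simulate. It is a sketch with some assumptions: the message is the certificate sample from earlier with a shortened msg text, escaped for JSON on the assumption that the wire data holds a literal backslash followed by n, and the preserve_original_event tag is added only so that event.original stays in the output.

```json
{
  "docs": [
    {
      "_source": {
        "tags": ["preserve_original_event"],
        "message": "<14>2023-10-22T14:43:08.892Z xdccxehoroda.sterling-cooper.org CEF:0|ExtraHop|Reveal(x)|7.8|1|Certificate Expiration Warning|3|cn1=25769990748 cn1Label=detectionID cn2=35 cn2Label=riskScore cs1=https://steerlingcooperkc.cloud.extrahop.com/extrahop/#/detections/detail/25769990748 cs1Label=detectionURL cs2=sec,sec.caution,sec.attack cs2Label=category rt=2023-10-22T14:43:08.785Z end=none start=2023-10-22T12:32:42.122Z src=00:50:56:AA:56:BB msg=This certificate is valid for 34 more day(s).\\n\\n* The issuer is PD1MJESONSLS1V.sterling-cooper.org"
      }
    }
  ]
}
```

This sample is useful because it has end=none and no dst, so it exercises the conditional set processors and the ignore_missing options as well as the two gsub steps.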
