Parsing custom log timestamps, how?

I'm having issues figuring out how to get a timestamp out of a custom log.

I've been trying to use the dissect and timestamp processors via the Custom configurations field in the fleet policy -> custom log screen.

Here is what that looked like:

multiline:
  type: pattern
  pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  negate: true
  match: after
processors:
  - dissect:
      tokenizer: '%{(\d{4}-\d{2}-\d{2}T\d{2}\:\d{2}\:\d{2}\.\d+-\d{2}\:\d{2})}'
      field: "message"
      target_prefix: "temptime"
  - timestamp:
      field: temptime
      layouts:
        - '2006-01-02T15:04:05.999-07:00'
      test:
        - '2022-07-06T09:39:32.116114-07:00'

That just doesn't work. I also tried using the grok MONTH, YEAR, MONTHDAY, etc. patterns, but that resulted in separate fields per grok pattern.

What am I missing? Note, I'm perfectly willing to use ingest pipelines if that will work better.

This is all via Elasticsearch 8.2.3 and Elastic Agent 8.2.3.

Per "Does Filebeat's timestamp processor need the source field to only contain the time?" (#7 by stephenb), here is a sanitized version of a log copied out of Kibana.

{
  "_index": ".ds-logs-oracle-default-2022.07.06-000001",
  "_id": "XT5THYIBBC7_TqPSSlPl",
  "_version": 1,
  "_score": 1,
  "_source": {
    "@timestamp": "2022-07-20T20:36:13.935Z",
    "log": {
      "file": {
        "path": "/path/to/oracle/log"
      },
      "flags": [
        "multiline"
      ],
      "offset": 16545725
    },
    "message": "2022-07-20T13:36:05.647792-07:00\nThread 1 advanced to log sequence 14207 (LGWR switch)\n  Current log# 2 seq# 14207 mem# 0: /ora_redo/redo_1/BANNERENV/redo02a.log\n  Current log# 2 seq# 14207 mem# 1: /ora_redo/redo_2/BANNERENV/redo02b.log\n  Current log# 2 seq# 14207 mem# 2: /ora_redo/redo_3/BANNERENV/redo02c.log",
    "data_stream": {
      "dataset": "oracle",
      "namespace": "default",
      "type": "logs"
    },
    "agent": {
      "id": "185c4974-815d-42ed-b3df-388b6aa2d2b0",
      "type": "filebeat",
      "version": "8.2.3",
      "ephemeral_id": "262fd438-8d6b-4b78-851c-0682ffed5386",
      "name": "oracledbserver.example.org"
    },
    "host": {
      "architecture": "x86_64",
      "os": {
        "platform": "ol",
        "version": "7.9",
        "family": "",
        "name": "Oracle Linux Server",
        "kernel": "3.10.0-1160.53.1.el7.x86_64",
        "type": "linux"
      },
      "id": "cea2ce0f768843d5ac13c76c03e4a478",
      "containerized": false,
      "ip": [
        "internalipv4-a",
        "internalipv6-a",
        "internalipv4-b",
        "internalipv6-b",
        "internalipv4-c",
        "internalipv6-c",
        "internalipv4-d"
      ],
      "mac": [
        "macaddress-a",
        "macaddress-b",
        "macaddress-c",
        "macaddress-d",
        "macaddress-d"
      ],
      "name": "oracledbserver.example.org",
      "hostname": "oracledbserver.example.org"
    },
    "ecs": {
      "version": "8.0.0"
    },
    "input": {
      "type": "log"
    },
    "event": {
      "dataset": "oracle"
    },
    "elastic_agent": {
      "version": "8.2.3",
      "id": "185c4974-815d-42ed-b3df-388b6aa2d2b0",
      "snapshot": false
    }
  },
  "fields": {
    "elastic_agent.version": [
      "8.2.3"
    ],
    "host.hostname": [
      "oracledbserver.example.org"
    ],
    "host.mac": [
      "macaddress-a",
      "macaddress-b",
      "macaddress-c",
      "macaddress-d",
      "macaddress-d"
    ],
    "host.ip": [
      "internalipv4-a",
      "internalipv6-a",
      "internalipv4-b",
      "internalipv6-b",
      "internalipv4-c",
      "internalipv6-c",
      "internalipv4-d"
    ],
    "agent.type": [
      "filebeat"
    ],
    "host.os.version": [
      "7.9"
    ],
    "host.os.kernel": [
      "3.10.0-1160.53.1.el7.x86_64"
    ],
    "host.os.name": [
      "Oracle Linux Server"
    ],
    "agent.name": [
      "oracledbserver.example.org"
    ],
    "host.name": [
      "oracledbserver.example.org"
    ],
    "elastic_agent.snapshot": [
      false
    ],
    "host.id": [
      "cea2ce0f768843d5ac13c76c03e4a478"
    ],
    "host.os.type": [
      "linux"
    ],
    "elastic_agent.id": [
      "185c4974-815d-42ed-b3df-388b6aa2d2b0"
    ],
    "data_stream.namespace": [
      "default"
    ],
    "input.type": [
      "log"
    ],
    "log.offset": [
      16545725
    ],
    "log.flags": [
      "multiline"
    ],
    "message": [
      "2022-07-20T13:36:05.647792-07:00\nThread 1 advanced to log sequence 14207 (LGWR switch)\n  Current log# 2 seq# 14207 mem# 0: /ora_redo/redo_1/BANNERENV/redo02a.log\n  Current log# 2 seq# 14207 mem# 1: /ora_redo/redo_2/BANNERENV/redo02b.log\n  Current log# 2 seq# 14207 mem# 2: /ora_redo/redo_3/BANNERENV/redo02c.log"
    ],
    "data_stream.type": [
      "logs"
    ],
    "host.architecture": [
      "x86_64"
    ],
    "@timestamp": [
      "2022-07-20T20:36:13.935Z"
    ],
    "agent.id": [
      "185c4974-815d-42ed-b3df-388b6aa2d2b0"
    ],
    "host.containerized": [
      false
    ],
    "ecs.version": [
      "8.0.0"
    ],
    "host.os.platform": [
      "ol"
    ],
    "log.file.path": [
      "/path/to/oracle/log"
    ],
    "data_stream.dataset": [
      "oracle"
    ],
    "agent.ephemeral_id": [
      "262fd438-8d6b-4b78-851c-0682ffed5386"
    ],
    "agent.version": [
      "8.2.3"
    ],
    "host.os.family": [
      ""
    ],
    "event.dataset": [
      "oracle"
    ]
  }
}

Thanks in advance!

Hi @jerrac. Which Filebeat module, if any, are you using? Or are you just using a normal input rather than the oracle module? It looks like you are just using the log input type, is that correct?

filebeat.inputs:
- type: log

To be complete, it would probably help to post your filebeat.yml.

In the meantime, I will create an ingest pipeline.

Here are the pipeline and the results:

DELETE _ingest/pipeline/discuss-pipeline

GET _ingest/pipeline/discuss-pipeline

# Note the special regex directive (?m): it lets GREEDYDATA match across the newlines in the multiline message
PUT _ingest/pipeline/discuss-pipeline
{
  "description": "My Discuss Pipeline",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{TIMESTAMP_ISO8601:timestamp}(?m)%{GREEDYDATA:message_details}"
        ]
      }
    },
    {
      "set": {
        "field": "@timestamp",
        "value": "{{{timestamp}}}"
      }
    }
  ]
}

# Simulate the Pipeline with a doc
POST _ingest/pipeline/discuss-pipeline/_simulate
{
  "docs": [
    {
      "_index": ".ds-logs-oracle-default-2022.07.06-000001",
      "_id": "XT5THYIBBC7_TqPSSlPl",
      "_version": 1,
      "_score": 1,
      "_source": {
        "@timestamp": "2022-07-20T20:36:13.935Z",
        "log": {
          "file": {
            "path": "/path/to/oracle/log"
          },
          "flags": [
            "multiline"
          ],
          "offset": 16545725
        },
        "message": """2022-07-20T13:36:05.647792-07:00
Thread 1 advanced to log sequence 14207 (LGWR switch)
  Current log# 2 seq# 14207 mem# 0: /ora_redo/redo_1/BANNERENV/redo02a.log
  Current log# 2 seq# 14207 mem# 1: /ora_redo/redo_2/BANNERENV/redo02b.log
  Current log# 2 seq# 14207 mem# 2: /ora_redo/redo_3/BANNERENV/redo02c.log""",
        "data_stream": {
          "dataset": "oracle",
          "namespace": "default",
          "type": "logs"
        },
        "agent": {
          "id": "185c4974-815d-42ed-b3df-388b6aa2d2b0",
          "type": "filebeat",
          "version": "8.2.3",
          "ephemeral_id": "262fd438-8d6b-4b78-851c-0682ffed5386",
          "name": "oracledbserver.example.org"
        },
        "host": {
          "architecture": "x86_64",
          "os": {
            "platform": "ol",
            "version": "7.9",
            "family": "",
            "name": "Oracle Linux Server",
            "kernel": "3.10.0-1160.53.1.el7.x86_64",
            "type": "linux"
          },
          "id": "cea2ce0f768843d5ac13c76c03e4a478",
          "containerized": false,
          "ip": [
            "internalipv4-a",
            "internalipv6-a",
            "internalipv4-b",
            "internalipv6-b",
            "internalipv4-c",
            "internalipv6-c",
            "internalipv4-d"
          ],
          "mac": [
            "macaddress-a",
            "macaddress-b",
            "macaddress-c",
            "macaddress-d",
            "macaddress-d"
          ],
          "name": "oracledbserver.example.org",
          "hostname": "oracledbserver.example.org"
        },
        "ecs": {
          "version": "8.0.0"
        },
        "input": {
          "type": "log"
        },
        "event": {
          "dataset": "oracle"
        },
        "elastic_agent": {
          "version": "8.2.3",
          "id": "185c4974-815d-42ed-b3df-388b6aa2d2b0",
          "snapshot": false
        }
      }
    }
  ]
}

# Results

{
  "docs" : [
    {
      "doc" : {
        "_index" : ".ds-logs-oracle-default-2022.07.06-000001",
        "_id" : "XT5THYIBBC7_TqPSSlPl",
        "_version" : "1",
        "_source" : {
          "agent" : {
            "name" : "oracledbserver.example.org",
            "id" : "185c4974-815d-42ed-b3df-388b6aa2d2b0",
            "type" : "filebeat",
            "ephemeral_id" : "262fd438-8d6b-4b78-851c-0682ffed5386",
            "version" : "8.2.3"
          },
          "log" : {
            "flags" : [
              "multiline"
            ],
            "file" : {
              "path" : "/path/to/oracle/log"
            },
            "offset" : 16545725
          },
          "elastic_agent" : {
            "id" : "185c4974-815d-42ed-b3df-388b6aa2d2b0",
            "version" : "8.2.3",
            "snapshot" : false
          },
          "message" : """2022-07-20T13:36:05.647792-07:00
Thread 1 advanced to log sequence 14207 (LGWR switch)
  Current log# 2 seq# 14207 mem# 0: /ora_redo/redo_1/BANNERENV/redo02a.log
  Current log# 2 seq# 14207 mem# 1: /ora_redo/redo_2/BANNERENV/redo02b.log
  Current log# 2 seq# 14207 mem# 2: /ora_redo/redo_3/BANNERENV/redo02c.log""",
          "input" : {
            "type" : "log"
          },
          "@timestamp" : "2022-07-20T13:36:05.647792-07:00",
          "ecs" : {
            "version" : "8.0.0"
          },
          "data_stream" : {
            "namespace" : "default",
            "type" : "logs",
            "dataset" : "oracle"
          },
          "host" : {
            "hostname" : "oracledbserver.example.org",
            "os" : {
              "kernel" : "3.10.0-1160.53.1.el7.x86_64",
              "name" : "Oracle Linux Server",
              "family" : "",
              "type" : "linux",
              "version" : "7.9",
              "platform" : "ol"
            },
            "containerized" : false,
            "ip" : [
              "internalipv4-a",
              "internalipv6-a",
              "internalipv4-b",
              "internalipv6-b",
              "internalipv4-c",
              "internalipv6-c",
              "internalipv4-d"
            ],
            "name" : "oracledbserver.example.org",
            "id" : "cea2ce0f768843d5ac13c76c03e4a478",
            "mac" : [
              "macaddress-a",
              "macaddress-b",
              "macaddress-c",
              "macaddress-d",
              "macaddress-d"
            ],
            "architecture" : "x86_64"
          },
          "message_details" : """
Thread 1 advanced to log sequence 14207 (LGWR switch)
  Current log# 2 seq# 14207 mem# 0: /ora_redo/redo_1/BANNERENV/redo02a.log
  Current log# 2 seq# 14207 mem# 1: /ora_redo/redo_2/BANNERENV/redo02b.log
  Current log# 2 seq# 14207 mem# 2: /ora_redo/redo_3/BANNERENV/redo02c.log""",
          "event" : {
            "dataset" : "oracle"
          },
          "timestamp" : "2022-07-20T13:36:05.647792-07:00"
        },
        "_ingest" : {
          "timestamp" : "2022-07-21T00:23:55.168647555Z"
        }
      }
    }
  ]
}
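
For anyone who wants to check the timestamp handling outside Elasticsearch, here is a rough Python sketch of what the pipeline above does to the sample message: take the leading ISO 8601 timestamp, keep everything after it as the details, and normalize the time to UTC. The regex and the function name `split_oracle_entry` are my own stand-ins, not the actual grok TIMESTAMP_ISO8601 definition, so treat it as an illustration only.

```python
import re
from datetime import datetime, timezone

# Simplified stand-in for grok's TIMESTAMP_ISO8601 (an assumption, not the real pattern).
# re.DOTALL plays the role of (?m) in the grok pattern: '.' also matches newlines.
ENTRY_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?[+-]\d{2}:\d{2})"
    r"(?P<message_details>.*)",
    re.DOTALL,
)


def split_oracle_entry(message):
    """Split a multiline log entry into its timestamp and the remaining text."""
    match = ENTRY_RE.match(message)
    if match is None:
        raise ValueError("message does not start with an ISO 8601 timestamp")
    raw = match.group("timestamp")
    # fromisoformat understands the '-07:00' offset; before Python 3.11 it
    # only accepts 3 or 6 fractional digits, which the sample satisfies.
    parsed = datetime.fromisoformat(raw)
    return {
        "timestamp": raw,
        "@timestamp": parsed.astimezone(timezone.utc).isoformat(),
        "message_details": match.group("message_details"),
    }


# First two lines of the sanitized sample message from this thread.
sample = (
    "2022-07-20T13:36:05.647792-07:00\n"
    "Thread 1 advanced to log sequence 14207 (LGWR switch)"
)
result = split_oracle_entry(sample)
print(result["@timestamp"])
```

On the sample document the event time works out to 20:36:05 UTC, a few seconds before the 20:36:13.935Z read time that the agent originally put in @timestamp.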

Now, in filebeat.yml, set the pipeline in the output section, assuming you did not use a module:

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  pipeline: discuss-pipeline

Thanks. I think that gets me where I need to be.

I'm not using Filebeat directly, just through Elastic Agent. So I created the pipeline through Kibana's UI, used the test screen to test it, and then configured the custom log policy to use it. Once the dev db is working again (anyone have spare storage they could send???) I'll trigger a log message to test that the full pipeline works.

These logs are not Oracle Audit logs. The Oracle Filebeat module specifically wants *.aud files, which the logs I'm parsing are not.

Um, I'm also looking to parse some logs in XML. I don't see anything in the docs saying it can be done, but is there a way to pull fields out of XML attributes (parameters)?

<msg time='2022-03-11T16:56:26.126-08:00' org_id='oracle' comp_id='tnslsnr'\n type='UNKNOWN' level='16' host_id='node.example.org'\n host_addr='ipv4' pid='11187'>
<txt>11-MAR-2022 16:56:26 * (CONNECT_DATA=(SERVER=DEDICATED)(SID=DBENV)(UR=A)(CID=(PROGRAM=something)(HOST=anotherhost.example.org)(USER=user))) * (ADDRESS=(PROTOCOL=tcp)(HOST=anotheripv4)(PORT=aport)) * establish * DBENV * 0\n </txt>
</msg>

As you can see, the timestamp is in the time parameter... How would you go about parsing data like that?

Decoding XML is something you would want to do on the Filebeat / Agent side; see here.

Then you would probably use an ingest pipeline to move the fields around, etc.
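
To get a feel for what the decoded event should contain before wiring up the agent, here is a small Python sketch that pulls the attributes and the txt body out of a cleaned-up copy of the sample above. It only illustrates the target structure; it is not how Filebeat or an ingest pipeline does the decoding (Filebeat has a decode_xml processor that may fit here). The function name `parse_listener_msg` is hypothetical, the txt body is shortened, and the sketch assumes the literal \n sequences in the pasted sample are real line breaks in the file.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def parse_listener_msg(xml_text):
    """Return the <msg> attributes plus the stripped <txt> body as a flat dict."""
    msg = ET.fromstring(xml_text)
    fields = dict(msg.attrib)  # time, org_id, comp_id, ...
    txt = msg.find("txt")
    fields["txt"] = (txt.text or "").strip() if txt is not None else ""
    return fields


# Cleaned-up copy of the sample; the <txt> body is shortened for the example.
sample = """<msg time='2022-03-11T16:56:26.126-08:00' org_id='oracle' comp_id='tnslsnr'
 type='UNKNOWN' level='16' host_id='node.example.org'
 host_addr='ipv4' pid='11187'>
 <txt>11-MAR-2022 16:56:26 * establish * DBENV * 0
 </txt>
</msg>"""

fields = parse_listener_msg(sample)
# The time attribute is ISO 8601 with an offset, so it parses directly.
event_time = datetime.fromisoformat(fields["time"])
print(fields["time"], fields["comp_id"])
```

The time attribute ends up as an ordinary string field, so the same kind of timestamp handling as in the first pipeline could then be applied to it.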