How can I add custom parse rules to filebeat?

I have a project that outputs a special log like:
| xxx | xxx | xxx |
| xxx | xxx | xxx |
| xxx | xxx | xxx |
I want to know if there is any way to add custom parse rules, or a plugin that could help me.

Hi @fansehep Welcome to the community!

If your file is truly that simple, you could look at the dissect parser.
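For a file that simple, a minimal dissect sketch might look like this (the column names `col1`..`col3` are placeholders, not from your file):

```yaml
processors:
  - dissect:
      # Matches rows like: | xxx | xxx | xxx |
      tokenizer: "| %{col1} | %{col2} | %{col3} |"
      field: "message"
      # Strip the padding spaces around each value
      trim_values: "all"
```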

It doesn't look that simple. The file has some other lines, so I am not sure a dissect string is the best way. What about the filebeat processors module? Actually, I would like to parse it with my own code.

That is about as simple as it gets... if you provide some sample log lines and the field names, we can show you.

Dissect is one of the filebeat processors.

Apologies I am not sure what you mean

In fact the log file looks like:

===============table_tiltle=====================
                                    |   TOTAL|    FAIL| FailNor| FailErr|  SUMVAL|  AVG(ms)|  MAX(ms)|  MIN(ms)|    >2ms|   >10ms|   >50ms|  >100ms|MAX_RECORD           |
name1.                              |      12|       0|       0|       0|       0| 0.009333|    0.012|    0.007|       0|       0|       0|       0|                     |
name2                               |      12|       0|       0|       0|       0| 0.131917|    0.157|    0.108|       0|       0|       0|       0|                     |
name3                               |      12|       0|       0|       0|       0| 0.140667|    0.167|    0.117|       0|       0|       0|       0|                     |
name4.                              |      12|       0|       0|       0|       0| 0.140833|    0.167|    0.117|       0|       0|       0|       0|                     |
--------------------------------------------------------------------
ALL                                 | xxx    | xxx    | xxx    | ...   
    

If dissect can solve this problem, that's very nice! :slight_smile:
What I mean is: is there some way to customize the processor module?

That is not a normal file :slight_smile:

Filebeat is made to read CSV, delimited files, log files, or IoT files.

I am not sure if that is "kinda sort of" like your file or the actual file... with the = signs, the dashes, and all that.

Could we parse that file... yes, probably, but I would probably do it with an ingest pipeline etc., not just filebeat processors...

I fixed your post... yes, we can probably parse that file. It would be better if you could post the actual log.

It looks pipe-delimited: |

Is the ALL row Totals?

This is the real log. It has = and |.
Yep, the ALL row is totals.

etc... How do I parse it? :sob:

I'm still confused about this: is there any way I can add custom parsing rules in filebeat, or with the help of a processor?

Yes, I think it can be parsed (perhaps not perfectly), but I will need to look at it tomorrow... I would probably use an ingest pipeline that runs in Elasticsearch, not a filebeat parser.
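For what it's worth, a rough sketch of an equivalent ingest pipeline might look like this (the pipeline name is made up, the field names come from your table header, and you would extend the convert step to the remaining columns):

```json
PUT _ingest/pipeline/piped-table-sketch
{
  "description": "Sketch: parse pipe-delimited table rows",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{NAME}|%{TOTAL}|%{FAIL}|%{FailNor}|%{FailErr}|%{SUMVAL}|%{AVG_ms}|%{MAX_ms}|%{MIN_ms}|%{GT_2ms}|%{GT_10ms}|%{GT_50ms}|%{GT_100ms}|%{MAX_RECORD}|"
      }
    },
    { "trim":    { "field": "NAME" } },
    { "convert": { "field": "TOTAL", "type": "integer", "ignore_missing": true } }
  ]
}
```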

Custom parsing in filebeat uses processors... they are not two different things...
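(As an aside: if you really do want to run your own parsing code inside filebeat, there is also the `script` processor, which runs a JavaScript function on each event. A minimal sketch; the fields parsed here are just illustrative:)

```yaml
processors:
  - script:
      lang: javascript
      id: my_custom_parser
      source: >
        function process(event) {
            var msg = event.Get("message");
            if (msg == null) return;
            // Illustrative: split the pipe-delimited row ourselves
            var parts = msg.split("|");
            if (parts.length > 1) {
                event.Put("NAME", parts[0].trim());
                event.Put("TOTAL", parts[1].trim());
            }
        }
```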

Thanks for your reply. I hope you can help me with this. :blush:

Here is my code. Of course, since this was not a simple, direct, normal file, it took a bit more. See if you can follow what I did... You may need to make adjustments if your file is different.

My Log File

===============table_tiltle=====================
                                    |   TOTAL|    FAIL| FailNor| FailErr|  SUMVAL|  AVG(ms)|  MAX(ms)|  MIN(ms)|    >2ms|   >10ms|   >50ms|  >100ms|MAX_RECORD           |
name1                               |      12|       0|       0|       0|       0| 0.009333|    0.012|    0.007|       0|       0|       0|       0|                 1234|
name2                               |      12|       0|       0|       0|       0| 0.131917|    0.157|    0.108|       0|       0|       0|       0|                 2345|
name3                               |      12|       0|       0|       0|       0| 0.140667|    0.167|    0.117|       0|       0|       0|       0|                 3456|
name4                               |      12|       0|       0|       0|       0| 0.140833|    0.167|    0.117|       0|       0|       0|       0|                 7890|
--------------------------------------------------------------------
TOTALS                              |      12|       0|       0|       0|       0| 0.140833|    0.167|    0.117|       0|       0|       0|       0|                 7890|

My filebeat.yml

filebeat.inputs:
- type: filestream
  id: my-filestream-id
  enabled: true
  paths:
    - /Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log

setup.kibana:

output.elasticsearch:
  hosts: ["localhost:9200"]

processors:

  # Dissect the rows 
  - dissect:
      tokenizer: "%{NAME}|%{TOTAL}|%{FAIL}|%{FailNor}|%{FailErr}|%{SUMVAL}|%{AVG_ms}|%{MAX_ms}|%{MIN_ms}|%{GT_2ms}|%{GT_10ms}|%{GT_50ms}|%{GT_100ms}|%{MAX_RECORD}|"
      field: "message"
      target_prefix: ""
      trim_values: "all"

  # Drop the bad rows    
  - drop_event:
      when:
        or:
          - contains:
              log.flags: "dissect_parsing_error"
          - equals:
              NAME: ""

  # Convert the fields             
  - convert:
      fields:
        - {from: "TOTAL",   type: "integer"}
        - {from: "FAIL",    type: "integer"}
        - {from: "FailNor", type: "integer"}
        - {from: "FailErr", type: "integer"}
        - {from: "SUMVAL",  type: "integer"}
        - {from: "AVG_ms",  type: "float"}
        - {from: "MAX_ms",  type: "float"}
        - {from: "MIN_ms",  type: "float"}
        - {from: "GT_2ms",  type: "integer"}
        - {from: "GT_10ms", type: "integer"}
        - {from: "GT_50ms", type: "integer"}
        - {from: "GT_100ms",   type: "integer"}
        - {from: "MAX_RECORD", type: "integer"}
      ignore_missing: true
      fail_on_error: false

Results

GET filebeat*/_search

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": ".ds-filebeat-8.11.1-2023.12.10-000001",
        "_id": "vLRIUowBpwIAo0SDwohh",
        "_score": 1,
        "_source": {
          "@timestamp": "2023-12-10T05:52:28.077Z",
          "AVG_ms": 0.009333,
          "NAME": "name1",
          "TOTAL": 12,
          "message": "name1                               |      12|       0|       0|       0|       0| 0.009333|    0.012|    0.007|       0|       0|       0|       0|                 1234|",
          "input": {
            "type": "filestream"
          },
          "ecs": {
            "version": "8.0.0"
          },
          "FailNor": 0,
          "GT_50ms": 0,
          "MAX_RECORD": 1234,
          "log": {
            "offset": 220,
            "file": {
              "path": "/Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log",
              "device_id": "16777221",
              "inode": "96220320"
            }
          },
          "agent": {
            "id": "554c4632-aa9c-46b8-ab46-107d9a245b85",
            "name": "hyperion",
            "type": "filebeat",
            "version": "8.11.1",
            "ephemeral_id": "0ba0450f-78fd-4bfd-8d73-acb1ba720853"
          },
          "GT_10ms": 0,
          "MIN_ms": 0.007,
          "SUMVAL": 0,
          "FAIL": 0,
          "FailErr": 0,
          "host": {
            "name": "hyperion"
          },
          "GT_100ms": 0,
          "GT_2ms": 0,
          "MAX_ms": 0.012
        }
      },
      {
        "_index": ".ds-filebeat-8.11.1-2023.12.10-000001",
        "_id": "vbRIUowBpwIAo0SDwohh",
        "_score": 1,
        "_source": {
          "@timestamp": "2023-12-10T05:52:28.077Z",
          "input": {
            "type": "filestream"
          },
          "FailNor": 0,
          "AVG_ms": 0.131917,
          "SUMVAL": 0,
          "GT_100ms": 0,
          "MAX_ms": 0.157,
          "host": {
            "name": "hyperion"
          },
          "MAX_RECORD": 2345,
          "GT_2ms": 0,
          "NAME": "name2",
          "GT_10ms": 0,
          "log": {
            "offset": 391,
            "file": {
              "path": "/Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log",
              "device_id": "16777221",
              "inode": "96220320"
            }
          },
          "FAIL": 0,
          "MIN_ms": 0.108,
          "message": "name2                               |      12|       0|       0|       0|       0| 0.131917|    0.157|    0.108|       0|       0|       0|       0|                 2345|",
          "ecs": {
            "version": "8.0.0"
          },
          "agent": {
            "name": "hyperion",
            "type": "filebeat",
            "version": "8.11.1",
            "ephemeral_id": "0ba0450f-78fd-4bfd-8d73-acb1ba720853",
            "id": "554c4632-aa9c-46b8-ab46-107d9a245b85"
          },
          "FailErr": 0,
          "TOTAL": 12,
          "GT_50ms": 0
        }
      },
      {
        "_index": ".ds-filebeat-8.11.1-2023.12.10-000001",
        "_id": "vrRIUowBpwIAo0SDwohh",
        "_score": 1,
        "_source": {
          "@timestamp": "2023-12-10T05:52:28.077Z",
          "GT_2ms": 0,
          "MAX_ms": 0.167,
          "log": {
            "offset": 562,
            "file": {
              "inode": "96220320",
              "path": "/Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log",
              "device_id": "16777221"
            }
          },
          "message": "name3                               |      12|       0|       0|       0|       0| 0.140667|    0.167|    0.117|       0|       0|       0|       0|                 3456|",
          "input": {
            "type": "filestream"
          },
          "host": {
            "name": "hyperion"
          },
          "GT_100ms": 0,
          "TOTAL": 12,
          "ecs": {
            "version": "8.0.0"
          },
          "SUMVAL": 0,
          "NAME": "name3",
          "FailErr": 0,
          "GT_50ms": 0,
          "FailNor": 0,
          "AVG_ms": 0.140667,
          "agent": {
            "ephemeral_id": "0ba0450f-78fd-4bfd-8d73-acb1ba720853",
            "id": "554c4632-aa9c-46b8-ab46-107d9a245b85",
            "name": "hyperion",
            "type": "filebeat",
            "version": "8.11.1"
          },
          "MIN_ms": 0.117,
          "GT_10ms": 0,
          "MAX_RECORD": 3456,
          "FAIL": 0
        }
      },
      {
        "_index": ".ds-filebeat-8.11.1-2023.12.10-000001",
        "_id": "v7RIUowBpwIAo0SDwohh",
        "_score": 1,
        "_source": {
          "@timestamp": "2023-12-10T05:52:28.077Z",
          "GT_10ms": 0,
          "TOTAL": 12,
          "FailNor": 0,
          "agent": {
            "version": "8.11.1",
            "ephemeral_id": "0ba0450f-78fd-4bfd-8d73-acb1ba720853",
            "id": "554c4632-aa9c-46b8-ab46-107d9a245b85",
            "name": "hyperion",
            "type": "filebeat"
          },
          "SUMVAL": 0,
          "MAX_RECORD": 7890,
          "FailErr": 0,
          "AVG_ms": 0.140833,
          "GT_100ms": 0,
          "GT_2ms": 0,
          "ecs": {
            "version": "8.0.0"
          },
          "MAX_ms": 0.167,
          "input": {
            "type": "filestream"
          },
          "host": {
            "name": "hyperion"
          },
          "MIN_ms": 0.117,
          "GT_50ms": 0,
          "NAME": "name4",
          "FAIL": 0,
          "message": "name4                               |      12|       0|       0|       0|       0| 0.140833|    0.167|    0.117|       0|       0|       0|       0|                 7890|",
          "log": {
            "offset": 733,
            "file": {
              "path": "/Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log",
              "device_id": "16777221",
              "inode": "96220320"
            }
          }
        }
      },
      {
        "_index": ".ds-filebeat-8.11.1-2023.12.10-000001",
        "_id": "wLRIUowBpwIAo0SDwohh",
        "_score": 1,
        "_source": {
          "@timestamp": "2023-12-10T05:52:28.078Z",
          "GT_100ms": 0,
          "GT_50ms": 0,
          "GT_2ms": 0,
          "AVG_ms": 0.140833,
          "log": {
            "offset": 973,
            "file": {
              "path": "/Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log",
              "device_id": "16777221",
              "inode": "96220320"
            }
          },
          "message": "TOTALS                              |      12|       0|       0|       0|       0| 0.140833|    0.167|    0.117|       0|       0|       0|       0|                 7890|",
          "input": {
            "type": "filestream"
          },
          "agent": {
            "version": "8.11.1",
            "ephemeral_id": "0ba0450f-78fd-4bfd-8d73-acb1ba720853",
            "id": "554c4632-aa9c-46b8-ab46-107d9a245b85",
            "name": "hyperion",
            "type": "filebeat"
          },
          "FailNor": 0,
          "TOTAL": 12,
          "GT_10ms": 0,
          "MIN_ms": 0.117,
          "NAME": "TOTALS",
          "SUMVAL": 0,
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "name": "hyperion"
          },
          "FailErr": 0,
          "FAIL": 0,
          "MAX_RECORD": 7890,
          "MAX_ms": 0.167
        }
      }
    ]
  }
}

:heart_eyes: It's a great start for me!!! Thanks so much.

Can I add a second dissect for a single log type?
There are two such formats (tables) inside this log.

Sure... you can put more than one dissect; they will be executed in order...

A matching dissect will succeed, and a non-matching one will throw the parse error.

So the drop logic as-is will probably not work unless you add a conditional so the second dissect only runs when the first one fails.

Otherwise you will drop good rows when the second dissect fails because the first one already succeeded.

Hope that makes sense
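A sketch of that conditional chaining, assuming the second tokenizer (shown here with placeholder fields) matches your other table format; note the parse-error flag from the first dissect will still be on the event, so the drop condition would need adjusting too:

```yaml
processors:
  - dissect:
      tokenizer: "%{NAME}|%{TOTAL}|%{FAIL}|%{FailNor}|%{FailErr}|%{SUMVAL}|%{AVG_ms}|%{MAX_ms}|%{MIN_ms}|%{GT_2ms}|%{GT_10ms}|%{GT_50ms}|%{GT_100ms}|%{MAX_RECORD}|"
      field: "message"
      target_prefix: ""
      trim_values: "all"

  # Only try the second table format when the first dissect failed
  - if:
      contains:
        log.flags: "dissect_parsing_error"
    then:
      - dissect:
          tokenizer: "%{FIELD_A}|%{FIELD_B}|%{FIELD_C}|"
          field: "message"
          target_prefix: ""
          trim_values: "all"
```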

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.