How can I add custom parse rules to filebeat?

I have a project that outputs a special log like:
| xxx | xxx | xxx |
| xxx | xxx | xxx |
| xxx | xxx | xxx |
I want to know if there is any way to add custom parse rules, or a plugin that could help me.

Hi @fansehep Welcome to the community!

If your file is truly that simple, you could look at the dissect parser.
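For a file that simple, a minimal dissect sketch might look like this (the column names `col1`..`col3` are placeholders, not from your file):

```yaml
processors:
  - dissect:
      # Matches rows like: | xxx | xxx | xxx |
      tokenizer: "| %{col1} | %{col2} | %{col3} |"
      field: "message"
      # Strip the padding spaces around each value
      trim_values: "all"
```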

It doesn't look that simple. The file has some other lines, so I am not sure a dissect string is the best way. What about the filebeat processors module? Actually, I would like to parse it with my own code.

That is about as simple as it gets... if you provide some sample log lines and the field names, we can show you.

Dissect is one of the filebeat processors.

Apologies I am not sure what you mean

In fact the log file looks like:

===============table_tiltle=====================
                                    |   TOTAL|    FAIL| FailNor| FailErr|  SUMVAL|  AVG(ms)|  MAX(ms)|  MIN(ms)|    >2ms|   >10ms|   >50ms|  >100ms|MAX_RECORD           |
name1.                              |      12|       0|       0|       0|       0| 0.009333|    0.012|    0.007|       0|       0|       0|       0|                     |
name2                               |      12|       0|       0|       0|       0| 0.131917|    0.157|    0.108|       0|       0|       0|       0|                     |
name3                               |      12|       0|       0|       0|       0| 0.140667|    0.167|    0.117|       0|       0|       0|       0|                     |
name4.                              |      12|       0|       0|       0|       0| 0.140833|    0.167|    0.117|       0|       0|       0|       0|                     |
--------------------------------------------------------------------
ALL                                 | xxx    | xxx    | xxx    | ...   
    

If dissect can solve this problem, that's very nice! :slight_smile:
What I mean is: is there some way to customize the processor module?

That is not a normal file :slight_smile:

Filebeat is made to read CSV, delimited files, log files, or IoT files.

I am not sure if that is "kinda sort of" like your file or the actual file... with the = signs, the dashes, and all that.

Could we parse that file... yes, probably, but I would probably do it with an ingest pipeline etc., not just filebeat processors...

I fixed your post... yes, we can probably parse that file. It would be better if you could post the actual log.

It looks pipe-delimited: |

Is the ALL row Totals?

This is the real log. It has = and |.
Yep, the ALL row is totals.

etc... How do I parse it? :sob:

I'm still confused about this: is there any way I can add custom parsing rules in filebeat, or with the help of a processor?

Yes, I think it can be parsed (perhaps not perfectly), but I will need to look at it tomorrow... I would probably use an ingest pipeline that runs in Elasticsearch, not a filebeat parser.
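For what it's worth, a rough sketch of an equivalent ingest pipeline might look like this (the pipeline name is made up, the field names come from your table header, and you would extend the convert step to the remaining columns):

```json
PUT _ingest/pipeline/piped-table-sketch
{
  "description": "Sketch: parse pipe-delimited table rows",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{NAME}|%{TOTAL}|%{FAIL}|%{FailNor}|%{FailErr}|%{SUMVAL}|%{AVG_ms}|%{MAX_ms}|%{MIN_ms}|%{GT_2ms}|%{GT_10ms}|%{GT_50ms}|%{GT_100ms}|%{MAX_RECORD}|"
      }
    },
    { "trim":    { "field": "NAME" } },
    { "convert": { "field": "TOTAL", "type": "integer", "ignore_missing": true } }
  ]
}
```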

Custom parsing in filebeat uses processors... they are not two different things...
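(As an aside: if you really do want to run your own parsing code inside filebeat, there is also the `script` processor, which runs a JavaScript function on each event. A minimal sketch; the fields parsed here are just illustrative:)

```yaml
processors:
  - script:
      lang: javascript
      id: my_custom_parser
      source: >
        function process(event) {
            var msg = event.Get("message");
            if (msg == null) return;
            // Illustrative: split the pipe-delimited row ourselves
            var parts = msg.split("|");
            if (parts.length > 1) {
                event.Put("NAME", parts[0].trim());
                event.Put("TOTAL", parts[1].trim());
            }
        }
```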

Thanks for your reply. I hope you can help me with this. :blush:

Here is my code. Of course, since this was not a simple, direct, normal file, it took a bit more. See if you can follow what I did... You may need to make adjustments if your file is different.

My Log File

===============table_tiltle=====================
                                    |   TOTAL|    FAIL| FailNor| FailErr|  SUMVAL|  AVG(ms)|  MAX(ms)|  MIN(ms)|    >2ms|   >10ms|   >50ms|  >100ms|MAX_RECORD           |
name1                               |      12|       0|       0|       0|       0| 0.009333|    0.012|    0.007|       0|       0|       0|       0|                 1234|
name2                               |      12|       0|       0|       0|       0| 0.131917|    0.157|    0.108|       0|       0|       0|       0|                 2345|
name3                               |      12|       0|       0|       0|       0| 0.140667|    0.167|    0.117|       0|       0|       0|       0|                 3456|
name4                               |      12|       0|       0|       0|       0| 0.140833|    0.167|    0.117|       0|       0|       0|       0|                 7890|
--------------------------------------------------------------------
TOTALS                              |      12|       0|       0|       0|       0| 0.140833|    0.167|    0.117|       0|       0|       0|       0|                 7890|

My filebeat.yml

filebeat.inputs:
- type: filestream
  id: my-filestream-id
  enabled: true
  paths:
    - /Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log

setup.kibana:

output.elasticsearch:
  hosts: ["localhost:9200"]

processors:

  # Dissect the rows 
  - dissect:
      tokenizer: "%{NAME}|%{TOTAL}|%{FAIL}|%{FailNor}|%{FailErr}|%{SUMVAL}|%{AVG_ms}|%{MAX_ms}|%{MIN_ms}|%{GT_2ms}|%{GT_10ms}|%{GT_50ms}|%{GT_100ms}|%{MAX_RECORD}|"
      field: "message"
      target_prefix: ""
      trim_values: "all"

  # Drop the bad rows    
  - drop_event:
      when:
        or:
          - contains:
              log.flags: "dissect_parsing_error"
          - equals:
              NAME: ""

  # Convert the fields             
  - convert:
      fields:
        - {from: "TOTAL",   type: "integer"}
        - {from: "FAIL",    type: "integer"}
        - {from: "FailNor", type: "integer"}
        - {from: "FailErr", type: "integer"}
        - {from: "SUMVAL",  type: "integer"}
        - {from: "AVG_ms",  type: "float"}
        - {from: "MAX_ms",  type: "float"}
        - {from: "MIN_ms",  type: "float"}
        - {from: "GT_2ms",  type: "integer"}
        - {from: "GT_10ms", type: "integer"}
        - {from: "GT_50ms", type: "integer"}
        - {from: "GT_100ms",   type: "integer"}
        - {from: "MAX_RECORD", type: "integer"}
      ignore_missing: true
      fail_on_error: false

Results

GET filebeat*/_search

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": ".ds-filebeat-8.11.1-2023.12.10-000001",
        "_id": "vLRIUowBpwIAo0SDwohh",
        "_score": 1,
        "_source": {
          "@timestamp": "2023-12-10T05:52:28.077Z",
          "AVG_ms": 0.009333,
          "NAME": "name1",
          "TOTAL": 12,
          "message": "name1                               |      12|       0|       0|       0|       0| 0.009333|    0.012|    0.007|       0|       0|       0|       0|                 1234|",
          "input": {
            "type": "filestream"
          },
          "ecs": {
            "version": "8.0.0"
          },
          "FailNor": 0,
          "GT_50ms": 0,
          "MAX_RECORD": 1234,
          "log": {
            "offset": 220,
            "file": {
              "path": "/Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log",
              "device_id": "16777221",
              "inode": "96220320"
            }
          },
          "agent": {
            "id": "554c4632-aa9c-46b8-ab46-107d9a245b85",
            "name": "hyperion",
            "type": "filebeat",
            "version": "8.11.1",
            "ephemeral_id": "0ba0450f-78fd-4bfd-8d73-acb1ba720853"
          },
          "GT_10ms": 0,
          "MIN_ms": 0.007,
          "SUMVAL": 0,
          "FAIL": 0,
          "FailErr": 0,
          "host": {
            "name": "hyperion"
          },
          "GT_100ms": 0,
          "GT_2ms": 0,
          "MAX_ms": 0.012
        }
      },
      {
        "_index": ".ds-filebeat-8.11.1-2023.12.10-000001",
        "_id": "vbRIUowBpwIAo0SDwohh",
        "_score": 1,
        "_source": {
          "@timestamp": "2023-12-10T05:52:28.077Z",
          "input": {
            "type": "filestream"
          },
          "FailNor": 0,
          "AVG_ms": 0.131917,
          "SUMVAL": 0,
          "GT_100ms": 0,
          "MAX_ms": 0.157,
          "host": {
            "name": "hyperion"
          },
          "MAX_RECORD": 2345,
          "GT_2ms": 0,
          "NAME": "name2",
          "GT_10ms": 0,
          "log": {
            "offset": 391,
            "file": {
              "path": "/Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log",
              "device_id": "16777221",
              "inode": "96220320"
            }
          },
          "FAIL": 0,
          "MIN_ms": 0.108,
          "message": "name2                               |      12|       0|       0|       0|       0| 0.131917|    0.157|    0.108|       0|       0|       0|       0|                 2345|",
          "ecs": {
            "version": "8.0.0"
          },
          "agent": {
            "name": "hyperion",
            "type": "filebeat",
            "version": "8.11.1",
            "ephemeral_id": "0ba0450f-78fd-4bfd-8d73-acb1ba720853",
            "id": "554c4632-aa9c-46b8-ab46-107d9a245b85"
          },
          "FailErr": 0,
          "TOTAL": 12,
          "GT_50ms": 0
        }
      },
      {
        "_index": ".ds-filebeat-8.11.1-2023.12.10-000001",
        "_id": "vrRIUowBpwIAo0SDwohh",
        "_score": 1,
        "_source": {
          "@timestamp": "2023-12-10T05:52:28.077Z",
          "GT_2ms": 0,
          "MAX_ms": 0.167,
          "log": {
            "offset": 562,
            "file": {
              "inode": "96220320",
              "path": "/Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log",
              "device_id": "16777221"
            }
          },
          "message": "name3                               |      12|       0|       0|       0|       0| 0.140667|    0.167|    0.117|       0|       0|       0|       0|                 3456|",
          "input": {
            "type": "filestream"
          },
          "host": {
            "name": "hyperion"
          },
          "GT_100ms": 0,
          "TOTAL": 12,
          "ecs": {
            "version": "8.0.0"
          },
          "SUMVAL": 0,
          "NAME": "name3",
          "FailErr": 0,
          "GT_50ms": 0,
          "FailNor": 0,
          "AVG_ms": 0.140667,
          "agent": {
            "ephemeral_id": "0ba0450f-78fd-4bfd-8d73-acb1ba720853",
            "id": "554c4632-aa9c-46b8-ab46-107d9a245b85",
            "name": "hyperion",
            "type": "filebeat",
            "version": "8.11.1"
          },
          "MIN_ms": 0.117,
          "GT_10ms": 0,
          "MAX_RECORD": 3456,
          "FAIL": 0
        }
      },
      {
        "_index": ".ds-filebeat-8.11.1-2023.12.10-000001",
        "_id": "v7RIUowBpwIAo0SDwohh",
        "_score": 1,
        "_source": {
          "@timestamp": "2023-12-10T05:52:28.077Z",
          "GT_10ms": 0,
          "TOTAL": 12,
          "FailNor": 0,
          "agent": {
            "version": "8.11.1",
            "ephemeral_id": "0ba0450f-78fd-4bfd-8d73-acb1ba720853",
            "id": "554c4632-aa9c-46b8-ab46-107d9a245b85",
            "name": "hyperion",
            "type": "filebeat"
          },
          "SUMVAL": 0,
          "MAX_RECORD": 7890,
          "FailErr": 0,
          "AVG_ms": 0.140833,
          "GT_100ms": 0,
          "GT_2ms": 0,
          "ecs": {
            "version": "8.0.0"
          },
          "MAX_ms": 0.167,
          "input": {
            "type": "filestream"
          },
          "host": {
            "name": "hyperion"
          },
          "MIN_ms": 0.117,
          "GT_50ms": 0,
          "NAME": "name4",
          "FAIL": 0,
          "message": "name4                               |      12|       0|       0|       0|       0| 0.140833|    0.167|    0.117|       0|       0|       0|       0|                 7890|",
          "log": {
            "offset": 733,
            "file": {
              "path": "/Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log",
              "device_id": "16777221",
              "inode": "96220320"
            }
          }
        }
      },
      {
        "_index": ".ds-filebeat-8.11.1-2023.12.10-000001",
        "_id": "wLRIUowBpwIAo0SDwohh",
        "_score": 1,
        "_source": {
          "@timestamp": "2023-12-10T05:52:28.078Z",
          "GT_100ms": 0,
          "GT_50ms": 0,
          "GT_2ms": 0,
          "AVG_ms": 0.140833,
          "log": {
            "offset": 973,
            "file": {
              "path": "/Users/sbrown/workspace/sample-data/discuss/discuss-piped-sample.log",
              "device_id": "16777221",
              "inode": "96220320"
            }
          },
          "message": "TOTALS                              |      12|       0|       0|       0|       0| 0.140833|    0.167|    0.117|       0|       0|       0|       0|                 7890|",
          "input": {
            "type": "filestream"
          },
          "agent": {
            "version": "8.11.1",
            "ephemeral_id": "0ba0450f-78fd-4bfd-8d73-acb1ba720853",
            "id": "554c4632-aa9c-46b8-ab46-107d9a245b85",
            "name": "hyperion",
            "type": "filebeat"
          },
          "FailNor": 0,
          "TOTAL": 12,
          "GT_10ms": 0,
          "MIN_ms": 0.117,
          "NAME": "TOTALS",
          "SUMVAL": 0,
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "name": "hyperion"
          },
          "FailErr": 0,
          "FAIL": 0,
          "MAX_RECORD": 7890,
          "MAX_ms": 0.167
        }
      }
    ]
  }
}

:heart_eyes: It's a great start for me!!! Thanks so much.

Can I add a second dissect for a single log type?
There are two such formats (tables) inside this log.

Sure... you can put more than one dissect; they will be executed in order...

A matching dissect will succeed, and a non-matching one will throw the parse error.

So the drop logic as-is will probably not work unless you add a conditional so the second dissect only runs when the first one fails.

Otherwise you will drop good rows when the second dissect fails because the first one already succeeded.

Hope that makes sense
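A sketch of that conditional chaining, assuming the second tokenizer (shown here with placeholder fields) matches your other table format; note the parse-error flag from the first dissect will still be on the event, so the drop condition would need adjusting too:

```yaml
processors:
  - dissect:
      tokenizer: "%{NAME}|%{TOTAL}|%{FAIL}|%{FailNor}|%{FailErr}|%{SUMVAL}|%{AVG_ms}|%{MAX_ms}|%{MIN_ms}|%{GT_2ms}|%{GT_10ms}|%{GT_50ms}|%{GT_100ms}|%{MAX_RECORD}|"
      field: "message"
      target_prefix: ""
      trim_values: "all"

  # Only try the second table format when the first dissect failed
  - if:
      contains:
        log.flags: "dissect_parsing_error"
    then:
      - dissect:
          tokenizer: "%{FIELD_A}|%{FIELD_B}|%{FIELD_C}|"
          field: "message"
          target_prefix: ""
          trim_values: "all"
```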

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.