How to convert field with "null" string value to 0 integer

Hi,

I am receiving the error below whilst using Filebeat to read a log. I know that I can explicitly skip the field by prefixing its key with ?, as in %{?processId}, but sometimes processId is a valid integer and is processed successfully. As the code that writes the log is out of my control, I am unable to fix it at source.

Error in log:

2021-02-13T14:06:57.055+0200	DEBUG	[processors]	processing/processors.go:112	Fail to apply processor client{rename=[{From:message To:rawMessage}], dissect=Host=%{hostname} %{date} %{time},%{milliseconds} (%{processId}) [%{threadId}] %{log.level->} %{appId} - %{message},field=rawMessage,target_prefix=, convert={"Fields":[{"From":"processId","To":"","Type":"integer"},{"From":"threadId","To":"","Type":"integer"}],"Tag":"","IgnoreMissing":true,"FailOnError":true,"Mode":"copy"}, add_locale=[format=offset], script=[type=javascript, id=, sources=inline.js], timestamp=[field=sourceTimestamp, target_field=@timestamp, timezone=UTC, layouts=[2006-01-02T15:04:05.000 -07:00]], drop_fields={"Fields":["hostname","date","time","event.timezone","sourceTimestamp","rawMessage"],"IgnoreMissing":true}}: failed in processor.convert: conversion of field [processId] to type [integer] failed: unable to convert value [null]: strconv.ParseInt: parsing "null": invalid syntax

Log Entry Example:

Host=ZAUSDCMAPP0185 2021-02-13 14:16:00,824 (null) [5] INFO  AISFWKService.Program - Run Main Loop, threads running : 0

Processor extract from filebeat.yml:

processors:
    - rename:
        fields:
          - from: "message"
            to: "rawMessage"
    - dissect:
        tokenizer: "Host=%{hostname} %{date} %{time},%{milliseconds} (%{processId}) [%{threadId}] %{log.level->} %{appId} - %{message}"
        field: "rawMessage"
        target_prefix: ''
    - convert:
        fields:
          - {
            from: "processId",
            type: "integer",
            on_failure: [
              {
               set: {
                field: "processId",
                value: 0
               }
              }
            ]
            }
          - {from: "threadId", type: "integer"}
        ignore_missing: true
        fail_on_error: true
    - add_locale: ~
    - script:
        lang: javascript
        id: my_filter
        source: >
          function process(event) {
            event.Put("sourceTimestamp", event.Get("date") + "T" + event.Get("time") + "." + event.Get("milliseconds") + " " + event.Get("event.timezone") );
          }
    - timestamp:
        field: "sourceTimestamp"
        layouts:
          - '2006-01-02T15:04:05.000 -07:00'
    - drop_fields:
        fields: ["hostname", "date", "time", "event.timezone", "sourceTimestamp", "rawMessage"]
        ignore_missing: true

Please can you assist?

Hi @Die_Meester, welcome to the community.

I think you may be mixing Beats processors with ingest processors (not uncommon, as they are similar). I don't think on_failure is part of the Beats processor syntax; it's part of the ingest processor syntax.
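If you would rather keep everything in Filebeat, one option is to normalise the value in a script processor placed between dissect and convert. This is only a minimal sketch, untested against your setup. It assumes dissect leaves the literal string "null" in processId (which is what the error message suggests) and that no other non-numeric values turn up there:

```javascript
// Sketch for a Filebeat script processor (lang: javascript).
// Assumption: it runs after dissect and before convert, and the
// only bad value seen in processId is the literal string "null".
function process(event) {
  var pid = event.Get("processId");
  if (pid === "null") {
    // Store "0" as a string; the convert processor that follows
    // then turns it into the integer 0 like any other value.
    event.Put("processId", "0");
  }
}
```

It would go in the source of a script processor, the same way as your existing timestamp script, and it only relies on the event.Get and event.Put calls that script already uses. The on_failure block would then come out of the convert entry.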

For what you are doing, perhaps consider using an ingest pipeline. They are a little more flexible and powerful, and they are centralized: they live in the Elasticsearch cluster, so if you update one you do not need to change and reload filebeat.yml on every host. See the ingest pipeline API documentation.

You tell Filebeat to use your ingest pipeline by specifying it in the output section:

output.elasticsearch:
  hosts: ["http://localhost:9200"]
  pipeline: my-pipeline

I did not do it all for you (threadId, for example, is not converted), but take a look at this:

PUT _ingest/pipeline/my-pipeline
{
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "Host=%{hostname} %{date} %{time},%{milliseconds} (%{processId}) [%{threadId}] %{log.level->} %{appId} - %{message}"
      }
    },
    {
      "convert": {
        "field": "processId",
        "type": "integer",
        "on_failure": [
          {
            "set": {
              "field": "processId",
              "value": 0
            }
          }
        ]
      }
    },
    {
      "set": {
        "field": "date_times",
        "value": "{{date}}T{{time}}.{{milliseconds}}Z"
      }
    },
    {
      "date": {
        "field": "date_times",
        "target_field": "@timestamp",
        "formats": ["date_optional_time||strict_date_optional_time"],
        "ignore_failure": true
      }
    }
  ]
}

POST /_ingest/pipeline/my-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "Host=ZAUSDCMAPP0185 2021-02-13 14:16:00,824 (null) [5] INFO  AISFWKService.Program - Run Main Loop, threads running : 0"
      }
    },
    {
      "_source": {
        "message": "Host=ZAUSDCMAPP0185 2021-02-13 14:16:00,824 (1234) [5] INFO  AISFWKService.Program - Run Main Loop, threads running : 0"
      }
    }
  ]
}

Output

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "date" : "2021-02-13",
          "milliseconds" : "824",
          "log" : {
            "level" : "INFO"
          },
          "message" : "Run Main Loop, threads running : 0",
          "threadId" : "5",
          "hostname" : "ZAUSDCMAPP0185",
          "@timestamp" : "2021-02-13T14:16:00.824Z",
          "processId" : 0,
          "appId" : "AISFWKService.Program",
          "date_times" : "2021-02-13T14:16:00.824Z",
          "time" : "14:16:00"
        },
        "_ingest" : {
          "timestamp" : "2021-02-14T02:45:32.845381Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "date" : "2021-02-13",
          "milliseconds" : "824",
          "log" : {
            "level" : "INFO"
          },
          "message" : "Run Main Loop, threads running : 0",
          "threadId" : "5",
          "hostname" : "ZAUSDCMAPP0185",
          "@timestamp" : "2021-02-13T14:16:00.824Z",
          "processId" : 1234,
          "appId" : "AISFWKService.Program",
          "date_times" : "2021-02-13T14:16:00.824Z",
          "time" : "14:16:00"
        },
        "_ingest" : {
          "timestamp" : "2021-02-14T02:45:32.845386Z"
        }
      }
    }
  ]
}
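
If it helps to check the parsing against a sample line without a cluster to hand, here is a rough local approximation that can be run with Node. It is a sketch only: the regex is a hand-written stand-in for the dissect pattern rather than how dissect works internally, and parseLine is a hypothetical helper name, not part of Filebeat or Elasticsearch.

```javascript
// Hand-written regex approximating the dissect pattern from this thread.
// \s+ after the log level stands in for the "->" padding modifier.
var LINE_RE = /^Host=(\S+) (\S+) ([^,]+),(\d+) \(([^)]*)\) \[([^\]]*)\] (\S+)\s+(\S+) - (.*)$/;

// Hypothetical helper: split one log line into fields, mapping any
// non-integer processId (such as "null") to 0.
function parseLine(line) {
  var m = LINE_RE.exec(line);
  if (m === null) {
    return null; // line does not follow the expected layout
  }
  var rawPid = m[5];
  return {
    hostname: m[1],
    date: m[2],
    time: m[3],
    milliseconds: m[4],
    processId: /^-?\d+$/.test(rawPid) ? parseInt(rawPid, 10) : 0,
    threadId: m[6],
    level: m[7],
    appId: m[8],
    message: m[9]
  };
}
```

Feeding it the two sample lines from the simulate request should give processId 0 for the (null) line and 1234 for the other, matching the output above.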