Parsing JSON logs using Logstash

Hi, I want to parse JSON logs using Logstash and send them to Elasticsearch. There are multiple nested fields in my logs, but I only want very specific fields. For example, here is my log format :

{
  "_index": "ekslogs-2021.05.27",
  "_type": "doc",
  "_id": "3Y6zrnkBzzvO6GYmqdMv",
  "_version": 1,
  "_score": null,
  "_source": {
"kubernetes": {
  "namespace": "inventory-mgmt",
  "replicaset": {
    "name": "public-service-5659649fc7"
  },
  "labels": {
    "app_kubernetes_io/routing": "NLB",
    "app_kubernetes_io/instance": "mus",
    "app_kubernetes_io/managed-by": "Helm",
    "app_kubernetes_io/version": "191e179_51",
    "helm_sh/chart": "public-service-191e179_51",
    "app_kubernetes_io/component": "microservice",
    "app_kubernetes_io/part-of": "public-service-management",
    "pod-template-hash": "5659649fc7",
    "repo": "public-service",
    "app_kubernetes_io/name": "public-service",
    "app": "mus"
  },
  "pod": {
    "name": "public-service-909090-phj9w",
    "uid": "64bd12bd-d07a-4ac2-9409-ac8fc703978e"
  },
  "node": {
    "name": "ip-10-63-21-989.ec2.internal"
  },
  "container": {
    "name": "public-service"
  }
},
"message": "{\"@message\":\"User not Authorized\",\"@timestamp\":\"2021-05-27T16:41:17.979Z\",\"@fields\":{\"level\":\"error\",\"context\":{\"code\":401,\"errorCode\":null,\"stack\":\"Authorization header is required.\",\"oStack\":null,\"innerMessage\":null,\"serviceName\":\"public-public-service\"},\"host\":\"public-service-5659649fc7-phj9w\",\"x-correlation-id\":\"d9ajd9hd9a-bf0a-11eb-b16b-d9dd5db88e32\"}}",
"stream": "stdout",
"log": {
  "offset": 33501,
  "file": {
    "path": "/var/lib/docker/containers/ab0e00a90dca71ef4fbf4d7e8aaa3fa711c723c05d6c52b2ffb1c22ed49ad3a4/ab0e00a90dca71ef4fbf4d7e8aaa3fa711c723c05d6c52b2ffb1c22ed49ad3a4-json.log"
  }
},
"cloud": {
  "availability_zone": "us-east-1b",
  "provider": "aws",
  "instance": {
    "id": "i-6565695695j95n9n65"
  },
  "machine": {
    "type": "m5.2xlarge"
  },
  "region": "us-east-1"
},
"ecs": {
  "version": "1.0.0"
},
"@version": "1",
"tags": [
  "beats_input_codec_plain_applied"
],
"input": {
  "type": "docker"
},
"@timestamp": "2021-05-27T16:41:17.979Z",
"agent": {
  "type": "filebeat",
  "id": "0000000-fcd1-4c6c-9e13-5645454",
  "version": "7.0.1",
  "hostname": "eks-filebeats-wzgkr",
  "ephemeral_id": "54185a8b-65a9-4c04-9a92-67b1d3ebbb7a"
},
"host": {
  "name": "eks-filebeats-wzgkr"
}
  },
  "fields": {
"@timestamp": [
  "2021-05-27T16:41:17.979Z"
]
  },
  "highlight": {
"kubernetes.namespace": [
  "@kibana-highlighted-field@inventory@/kibana-highlighted-field@-@kibana-highlighted-field@mgmt@/kibana-highlighted-field@"
],
"kubernetes.pod.name": [
  "@kibana-highlighted-field@public@/kibana-highlighted-field@-@kibana-highlighted-field@service@/kibana-highlighted-field@-5659649fc7-phj9w"
]
  },
  "sort": [
1622133677979
  ]
}

The output I want is something like :
kubernetes
labels
pod
node
container
cloud.instance_id
machine
message (this is the field that contains the application logs as nested JSON; I want to extract the values in here as well and bring them up to the root level)
timestamp

Can anyone help here?

Please edit your post, select the log and click on </> in the toolbar above the edit panel. That will change the format from

"_score": null,
"_source": {
"kubernetes": {
"namespace": "inventory-mgmt",

to

"_score": null,
"_source": {
    "kubernetes": {
        "namespace": "inventory-mgmt",

and prevent parts of the message being consumed as markup.

@Badger I have edited that message.

It is not clear what you mean by this. Assuming that the data you show is a document from Elasticsearch, what Logstash will see is the contents of _source. If you process that with

    prune { whitelist_names => [ "cloud", "kubernetes", "message" ] }
    json { source => "message" remove_field => [ "message" ] }

it will remove all the fields that you seem not to want. I whitelist [cloud] to keep [cloud][machine] and [cloud][instance][id], and I whitelist [kubernetes] to keep the pod, labels, node and container sub-fields.
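
For context, those two filters would sit in the filter section something like this (just a sketch wrapping the snippet above; whitelist_names is a list of patterns matched against the top-level field names):

filter {
    # drop every top-level field whose name does not match one of these patterns
    prune { whitelist_names => [ "cloud", "kubernetes", "message" ] }

    # parse the JSON string in [message] into fields, then drop the raw string
    json { source => "message" remove_field => [ "message" ] }
}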

If you want to move these to the top level you could do something like

mutate {
    rename => {
        "[kubernetes][pod]" => "pod"
        "[kubernetes][labels]" => "labels"
        "[kubernetes][node][name]" => "node"
        "[kubernetes][container][name]" => "container"
    }
    # Still contains namespace, node, replicaset, and container fields
    remove_field => [ "kubernetes" ]

}

Similarly for the other fields you mentioned. The above would give you

{
       "pod" => {
    "name" => "public-service-909090-phj9w",
     "uid" => "64bd12bd-d07a-4ac2-9409-ac8fc703978e"
},
      "node" => "ip-10-63-21-989.ec2.internal",
   "@fields" => {
                "host" => "public-service-5659649fc7-phj9w",
    "x-correlation-id" => "d9ajd9hd9a-bf0a-11eb-b16b-d9dd5db88e32",
               "level" => "error",
             "context" => {
                "code" => 401,
        "innerMessage" => nil,
           "errorCode" => nil,
         "serviceName" => "public-public-service",
               "stack" => "Authorization header is required.",
              "oStack" => nil
    }
},
     "cloud" => {
             "provider" => "aws",
             "instance" => {
        "id" => "i-6565695695j95n9n65"
    },
    "availability_zone" => "us-east-1b",
              "machine" => {
        "type" => "m5.2xlarge"
    },
               "region" => "us-east-1"
},
  "@message" => "User not Authorized",
    "labels" => {
                   "helm_sh/chart" => "public-service-191e179_51",
                            "repo" => "public-service",
       "app_kubernetes_io/routing" => "NLB",
      "app_kubernetes_io/instance" => "mus",
          "app_kubernetes_io/name" => "public-service",
    "app_kubernetes_io/managed-by" => "Helm",
     "app_kubernetes_io/component" => "microservice",
                             "app" => "mus",
               "pod-template-hash" => "5659649fc7",
       "app_kubernetes_io/version" => "191e179_51",
       "app_kubernetes_io/part-of" => "public-service-management"
},
"@timestamp" => 2021-05-27T16:41:17.979Z,
 "container" => "public-service"
}
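
For the cloud fields in your list (cloud.instance_id and machine) a similar mutate should work; the target names cloud_instance_id and machine below are just placeholders, pick whatever names you prefer:

mutate {
    rename => {
        "[cloud][instance][id]" => "cloud_instance_id"
        "[cloud][machine][type]" => "machine"
    }
    # Still contains availability_zone, provider, and region
    remove_field => [ "cloud" ]
}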

@Badger sorry, I did not explain it correctly. You are right that this is what Elasticsearch sees and not the source.
I have a Kubernetes cluster where the hosted services are writing their logs to stdout.
I have configured Filebeat to ship the logs from stdout to Logstash.
Logstash (without any filter) is sending the logs on to Elasticsearch.
So Filebeat is sending the Kubernetes logs along with the application logs as well.
This set-up is working without any errors; now I need to parse these logs so that I can apply proper filters and build dashboards.
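
For reference, the current pass-through pipeline is roughly the following (the port, hosts and index pattern here are placeholders, not the exact values):

input {
  beats {
    port => 5044              # Filebeat ships the container logs here
  }
}
# no filter section yet, events are forwarded as-is
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "ekslogs-%{+YYYY.MM.dd}"
  }
}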

Here is one more sample :

{
  "_index": "ekslogs-2021.05.27",
  "_type": "doc",
  "_id": "3Y6zrnkBzzvO6GYmqdMv",
  "_version": 1,
  "_score": null,
  "_source": {
    "kubernetes": {
      "namespace": "inventory-mgmt",
      "replicaset": {
        "name": "public-service-5659649fc7"
      },
      "labels": {
        "app_kubernetes_io/routing": "NLB",
        "app_kubernetes_io/instance": "mus",
        "app_kubernetes_io/managed-by": "Helm",
        "app_kubernetes_io/version": "191e179_51",
        "helm_sh/chart": "public-service-191e179_51",
        "app_kubernetes_io/component": "microservice",
        "app_kubernetes_io/part-of": "public-service-management",
        "pod-template-hash": "5659649fc7",
        "repo": "public-service",
        "app_kubernetes_io/name": "public-service",
        "app": "mus"
      },
      "pod": {
        "name": "public-service-909090-phj9w",
        "uid": "64bd12bd-d07a-4ac2-9409-ac8fc703978e"
      },
      "node": {
        "name": "ip-10-63-21-989.ec2.internal"
      },
      "container": {
        "name": "public-service"
      }
    },
    "message": "**{\"@message\":\"User not Authorized\",\"@timestamp\":\"2021-05-27T16:41:17.979Z\",\"@fields\":{\"level\":\"error\",\"context\":{\"code\":401,\"errorCode\":null,\"stack\":\"Authorization header is required.\",\"oStack\":null,\"innerMessage\":null,\"serviceName\":\"mosaic-upload-service\"},\"host\":\"upload-service-5659649fc7-phj9w\",\"x-correlation-id\":\"5c0bd6b0-bf0a-11eb-b16b-d9dd5db88e32\"}}"**,
    "stream": "stdout",
    "log": {
      "offset": 33501,
      "file": {
        "path": "/var/lib/docker/containers/ab0e00a90dca71ef4fbf4d7e8aaa3fa711c723c05d6c52b2ffb1c22ed49ad3a4/ab0e00a90dca71ef4fbf4d7e8aaa3fa711c723c05d6c52b2ffb1c22ed49ad3a4-json.log"
      }
    },
    "cloud": {
      "availability_zone": "us-east-1b",
      "provider": "aws",
      "instance": {
        "id": "i-6565695695j95n9n65"
      },
      "machine": {
        "type": "m5.2xlarge"
      },
      "region": "us-east-1"
    },
    "ecs": {
      "version": "1.0.0"
    },
    "@version": "1",
    "tags": [
      "beats_input_codec_plain_applied"
    ],
    "input": {
      "type": "docker"
    },
    "@timestamp": "2021-05-27T16:41:17.979Z",
    "agent": {
      "type": "filebeat",
      "id": "0000000-fcd1-4c6c-9e13-5645454",
      "version": "7.0.1",
      "hostname": "eks-filebeats-wzgkr",
      "ephemeral_id": "54185a8b-65a9-4c04-9a92-67b1d3ebbb7a"
    },
    "host": {
      "name": "eks-filebeats-wzgkr"
    }
  },
  "fields": {
    "@timestamp": [
      "2021-05-27T16:41:17.979Z"
    ]
  },
  "highlight": {
    "kubernetes.namespace": [
      "@kibana-highlighted-field@inventory@/kibana-highlighted-field@-@kibana-highlighted-field@mgmt@/kibana-highlighted-field@"
    ],
    "kubernetes.pod.name": [
      "@kibana-highlighted-field@public@/kibana-highlighted-field@-@kibana-highlighted-field@service@/kibana-highlighted-field@-5659649fc7-phj9w"
    ]
  },
  "sort": [
    1622133677979
  ]
}

The below part in the above message is what I need to get (everything else can be ignored)

message": "**{\"@message\":\"User not Authorized\",\"@timestamp\":\"2021-05-27T16:41:17.979Z\",\"@fields\":{\"level\":\"error\",\"context\":{\"code\":401,\"errorCode\":null,\"stack\":\"Authorization header is required.\",\"oStack\":null,\"innerMessage\":null,\"serviceName\":\"mosaic-upload-service\"},\"host\":\"upload-service-5659649fc7-phj9w\",\"x-correlation-id\":\"5c0bd6b0-bf0a-11eb-b16b-d9dd5db88e32\"}}"

which should look something like this as an end result :

{
	"message": "NOT_FOUND",
	"time": "2021-05-26T14:40:26.923Z",
	"level": "debug",
	"code-file": "/home/app/appfiles/src/lib/errors/handler.js",
	"error-name": "AppError",
	"status-code": 404,
	"service-identifier": "mosaic-asset-mgmt-service",
	"stacktrace": "AppError: NOT_FOUND\n    at /home/app/appfiles/src/lib/errors/handler.js:77:17\n    at Layer.handle [as handle_request] (/home/app/appfiles/node_modules/express/lib/router/layer.js:95:5)\n    at trim_prefix (/home/app/appfiles/node_modules/express/lib/router/index.js:317:13)\n    at /home/app/appfiles/node_modules/express/lib/router/index.js:284:7\n    at Function.process_params (/home/app/appfiles/node_modules/express/lib/router/index.js:335:12)\n    at next (/home/app/appfiles/node_modules/express/lib/router/index.js:275:10)\n    at /home/app/appfiles/node_modules/express-mung/index.js:60:17\n    at Layer.handle [as handle_request] (/home/app/appfiles/node_modules/express/lib/router/layer.js:95:5)\n    at trim_prefix (/home/app/appfiles/node_modules/express/lib/router/index.js:317:13)\n    at /home/app/appfiles/node_modules/express/lib/router/index.js:284:7"
	"host": "asset-management-service-767d9c9bc9-29mqt",
	"x-correlation-id": "4faa98a0-be30-11eb-85b2-71aa5a1d1842"
}

As per your above message, if I apply a config something like the one below, will it work :

input {
  file {
    path => "app.log"
    start_position => "beginning"
    codec => multiline {
      pattern => "^Spalanzani"
      negate => "true"
      what => "previous"
      auto_flush_interval => 1
    }
  }
}

filter{
  json {
    source => "message" 
    }

  mutate {
    rename => {
        "[@message]" => "message"
        "[@timestamp]" => "time"
        "[@fields][level]" => "level"
        "[@fields][level][context][file]" => "code-file"
        "[@fields][level][context][name]" => "error-name"
        "[@fields][level][context][status]" => "status-code"
        "[@fields][level][context][serviceName]" => "service-identifier"
        "[@fields][level][context][stack]" => "stacktrace"
    }

  }
}

output { stdout { codec => rubydebug } }

That looks right.

Let's unwrap that

input {
file {
ppath => "/Users/learnelk/Documents/logging/logstash/event-data/upload-app.log"
}
filter{ json

You are missing a } to close the input section, so it is parsing filter as if it were an input plugin (it would find out later that no such input exists). Then it is treating json as an option to that filter input, and options cannot be followed by {

Note also that when it gets as far as loading the file input it will complain that a file input does not have a ppath option, it should be path.
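
In other words the quoted snippet would need to be shaped like this, keeping your path and just fixing the option name and the braces:

input {
  file {
    path => "/Users/learnelk/Documents/logging/logstash/event-data/upload-app.log"
  }
}   # <- this brace closes the input section

filter {
  json { source => "message" }
}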

I noticed that and deleted that post; working on it and will reply with the result.

Log event :

{
   "@message":"NOT_FOUND",
   "@timestamp":"2021-05-27T19:28:57.765Z",
   "@fields":{
      "level":"debug",
      "context":{
         "file":"/home/app/appfiles/src/lib/errors/handler.js",
         "name":"AppError",
         "status":404,
         "serviceName":"mosaic-asset-mgmt-service",
         "stack":"AppError: NOT_FOUND\n    at /home/app/appfiles/src/lib/errors/handler.js:77:17\n    at Layer.handle [as handle_request] (/home/app/appfiles/node_modules/express/lib/router/layer.js:95:5)\n    at trim_prefix (/home/app/appfiles/node_modules/express/lib/router/index.js:317:13)\n    at /home/app/appfiles/node_modules/express/lib/router/index.js:284:7\n    at Function.process_params (/home/app/appfiles/node_modules/express/lib/router/index.js:335:12)\n    at next (/home/app/appfiles/node_modules/express/lib/router/index.js:275:10)\n    at /home/app/appfiles/node_modules/express-mung/index.js:60:17\n    at Layer.handle [as handle_request] (/home/app/appfiles/node_modules/express/lib/router/layer.js:95:5)\n    at trim_prefix (/home/app/appfiles/node_modules/express/lib/router/index.js:317:13)\n    at /home/app/appfiles/node_modules/express/lib/router/index.js:284:7"
      },
      "host":"asset-management-service-767d9c9bc9-zlf26",
      "x-correlation-id":"c8255d50-bf21-11eb-b0a4-877e39250e05"
   }
}

LogstashConf :

input {
  file {
    path => "/Users/animesh/Documents/logging/logstash/event-data/upload-app.log"
    start_position => "beginning"
    codec => multiline {
      pattern => "^Spalanzani"
      negate => "true"
      what => "previous"
      auto_flush_interval => 1
    }
  }
}

filter{
  json {
    source => "message" 
    }

  mutate {
    rename => {
        "[@message]" => "message"
        "[@timestamp]" => "time"
        "[@fields][level]" => "level"
        "[@fields][level][context][file]" => "code-file"
        "[@fields][level][context][name]" => "error-name"
        "[@fields][level][context][status]" => "status-code"
        "[@fields][level][context][serviceName]" => "service-identifier"
        "[@fields][level][context][stack]" => "stacktrace"
    }

  }
}

 output { stdout { codec => rubydebug  }}

Output :

{
        "host" => "learnelk-mac.local",
        "path" => "/Users/learnelk/Documents/logging/logstash/event-data/upload-app.log",
        "time" => 2021-05-27T19:28:57.765Z,
     "@fields" => {
        "x-correlation-id" => "c8255d50-bf21-11eb-b0a4-877e39250e05",
                 "context" => {
                   "name" => "AppError",
            "serviceName" => "mosaic-asset-mgmt-service",
                 "status" => 404,
                   "file" => "/home/app/appfiles/src/lib/errors/handler.js",
                  "stack" => "AppError: NOT_FOUND\n    at /home/app/appfiles/src/lib/errors/handler.js:77:17\n    at Layer.handle [as handle_request] (/home/app/appfiles/node_modules/express/lib/router/layer.js:95:5)\n    at trim_prefix (/home/app/appfiles/node_modules/express/lib/router/index.js:317:13)\n    at /home/app/appfiles/node_modules/express/lib/router/index.js:284:7\n    at Function.process_params (/home/app/appfiles/node_modules/express/lib/router/index.js:335:12)\n    at next (/home/app/appfiles/node_modules/express/lib/router/index.js:275:10)\n    at /home/app/appfiles/node_modules/express-mung/index.js:60:17\n    at Layer.handle [as handle_request] (/home/app/appfiles/node_modules/express/lib/router/layer.js:95:5)\n    at trim_prefix (/home/app/appfiles/node_modules/express/lib/router/index.js:317:13)\n    at /home/app/appfiles/node_modules/express/lib/router/index.js:284:7"
        },
                    "host" => "asset-management-service-767d9c9bc9-zlf26"
    },
    "@version" => "1",
       "level" => "debug",
     "message" => "NOT_FOUND"
}

I did not get any error, but it doesn't seem to work, because the desired and expected output is something like this :

{
	"message": "NOT_FOUND",
	"time": "2021-05-26T14:40:26.923Z",
	"level": "debug",
	"code-file": "/home/app/appfiles/src/lib/errors/handler.js",
	"error-name": "AppError",
	"status-code": 404,
	"service-identifier": "mosaic-asset-mgmt-service",
	"stacktrace": "AppError: NOT_FOUND\n    at /home/app/appfiles/src/lib/errors/handler.js:77:17\n    at Layer.handle [as handle_request] (/home/app/appfiles/node_modules/express/lib/router/layer.js:95:5)\n    at trim_prefix (/home/app/appfiles/node_modules/express/lib/router/index.js:317:13)\n    at /home/app/appfiles/node_modules/express/lib/router/index.js:284:7\n    at Function.process_params (/home/app/appfiles/node_modules/express/lib/router/index.js:335:12)\n    at next (/home/app/appfiles/node_modules/express/lib/router/index.js:275:10)\n    at /home/app/appfiles/node_modules/express-mung/index.js:60:17\n    at Layer.handle [as handle_request] (/home/app/appfiles/node_modules/express/lib/router/layer.js:95:5)\n    at trim_prefix (/home/app/appfiles/node_modules/express/lib/router/index.js:317:13)\n    at /home/app/appfiles/node_modules/express/lib/router/index.js:284:7"
	"host": "asset-management-service-767d9c9bc9-29mqt",
	"x-correlation-id": "4faa98a0-be30-11eb-85b2-71aa5a1d1842"
}

You need to remove [level] from several of these.
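
That is, in your log event the context fields are directly under [@fields][context], not [@fields][level][context], so the renames would be something like this (same target names you chose, only the paths change):

mutate {
    rename => {
        "[@message]" => "message"
        "[@timestamp]" => "time"
        "[@fields][level]" => "level"
        "[@fields][context][file]" => "code-file"
        "[@fields][context][name]" => "error-name"
        "[@fields][context][status]" => "status-code"
        "[@fields][context][serviceName]" => "service-identifier"
        "[@fields][context][stack]" => "stacktrace"
    }
}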


Thanks a lot @Badger. I am new to this, so just getting things started. My bad, I did not look at the message properly; it does not have level in between.
