Parse a JSON-encoded string

I'm having trouble parsing a JSON-encoded string with the json filter.

Sample stdout output from Logstash without any filters looks like this:

{
    "agent": {
        "type": "filebeat",
        "ephemeral_id": "0a8384b3-c4df-46e9-919a-8548835a37e4",
        "id": "0db0b40d-2912-4db8-a515-dc5b99254fea",
        "hostname": "XXXXXXXXX",
        "version": "7.3.0"
    },
    "kubernetes": {
        "replicaset": {
            "name": "XXXXXXXXX"
        },
        "namespace": "XXXXXXXXX",
        "node": {
            "name": "XXXXXXXXX"
        },
        "pod": {
            "name": "XXXXXXXXX",
            "uid": "3ff7aeaf-c341-11e9-b656-0a7c7841873e"
        },
        "labels": {
            "app": "XXXXXXXXXapp",
            "pod-template-hash": "3521946664"
        },
        "container": {
            "name": "XXXXXXXXXapp"
        }
    },
    "input": {
        "type": "container"
    },
    "@version": "1",
    "@timestamp": "2019-08-21T07:55:43.165Z",
    "log": {
        "offset": 771422,
        "file": {
            "path": "/var/log/containers/XXXXXXXXX-app-XXXXXXXXX.log"
        }
    },
    "host": {
        "containerized": false,
        "hostname": "XXXXXXXXX",
        "name": "XXXXXXXXX",
        "architecture": "x86_64",
        "os": {
            "name": "CentOS Linux",
            "kernel": "4.14.62-70.117.amzn2.x86_64",
            "platform": "centos",
            "codename": "Core",
            "version": "7 (Core)",
            "family": "redhat"
        }
    },
    "message": "{\"method\":\"GET\",\"path\":\"/XXXX/v1/XXXX/XXXX\",\"format\":\"json\",\"controller\":\"V1::Users::XXXXXController\",\"action\":\"show\",\"status\":200,\"duration\":7.47,\"view\":1.5,\"db\":2.74,\"time\":\"2019-08-21 07:55:43 UTC\",\"type\":\"rails\",\"environment\":\"staging\",\"host\":\"XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XX\",\"request_id\":\"XXXXXXXXX\",\"remote_ip\":\"XXXXXXXXX\",\"params\":{},\"user_id\":XXXXXXXXX,\"admin_id\":null,\"sql_queries\":\"'XXXXXXXXX'\",\"sql_queries_count\":2}",
    "ecs": {
        "version": "1.0.1"
    },
    "tags": [
        "XXXXXXXXX",
        "XXXXXXXXX",
        "XXXXXXXXX"
    ],
    "stream": "stdout",
    "cloud": {
        "account": {
            "id": "XXXXXX"
        },
        "region": "eu-west-1",
        "instance": {
            "id": "XXXXXXXX"
        },
        "availability_zone": "XXXXXXXXX",
        "image": {
            "id": "XXXXXXXXX"
        },
        "provider": "aws",
        "machine": {
            "type": "XXXXXXXXX"
        }
    }
}

When I add this filter:

json {
  source => "message"
  target => "message_json"
  skip_on_invalid_json => true
}

I get this error from Logstash:
[2019-08-21T08:20:17,590][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-7.3.0-2019.08.21", :_type=>"_doc", :routing=>nil}, #<LogStash::Event:0x6e7df113>], :response=>{"index"=>{"_index"=>"filebeat-7.3.0-2019.08.21", "_type"=>"_doc", "_id"=>"OUVDs2wBVU3WxI4vJOYO", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [message_json] of type [keyword] in document with id 'OUVDs2wBVU3WxI4vJOYO'. Preview of field's value: '{headers={sec-fetch-mode=cors, referer=https://XXXXXXX.XXXXXXX.XXXXXXX.XX/XXXXXXX/136, sec-fetch-site=same-site, x-forwarded-proto=https, accept-language=de-CH, origin=https://XXXXXXX-XXXXXXX.XXXXXXX.XXXXXXX.XX, x-forwarded-port=XXX, x-forwarded-for=XX.XX.XX.X, accept=*/*, authorization=Bearer XXXXX.XXXXXXXXXX, x-amzn-trace-id=XXXXXXXXXX, host=XXXXXXXXXX.XXXXXXXXXX.XXXXXXXXXX.XX, accept-encoding=gzip, deflate, br, user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36}, httpVersion=1.1, method=GET, level=info, query={XXXXX=139}, params={}, message=http-request-received, url=/XXXXX/v1/XXXX?XXXXX=139, timestamp=2019-08-21T08:20:11.034Z}'", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:2067"}}}}}

My goal is to parse the string-encoded JSON inside the "message" field.

The good news is that the JSON is getting parsed, so that [message_json] is an object containing many fields. The bad news is that your index already contains some documents in which [message_json] is a string. A field in Elasticsearch cannot be both a string and an object.

If you have a mapping that forces message_json to be a keyword then remove it. Otherwise either rename the field or start with an empty index.
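
For the rename option, a minimal sketch of the filter could look like the following. The target name message_parsed is purely hypothetical; any field name that is not already mapped in the index should work, assuming no index template forces that name to a string type either.

```
filter {
  json {
    source => "message"
    # Hypothetical new field name, chosen only so it does not collide
    # with the existing [message_json] keyword mapping.
    target => "message_parsed"
    skip_on_invalid_json => true
  }
}
```

With this in place the parsed object lands in [message_parsed], and the existing [message_json] mapping in the index is left alone.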
