How to parse fields that contain commas for a JSON Formatted response?

Hi community,

I'm having somewhat of a hard time trying to correctly parse Json formatted data from an API call I'm doing using the Custom API integration within the Kibana UI.

Example of my data im trying to parse in the "message" field: (Kindly that this is an example of my data, i cannot paste the exact data as it contains sensitive info)

{"results":{"bundle”:”00”,”category”:”b”,”commit_err":null,"description”:”blabla 8,192kbps - cat - Solution:(null) - Asset:[null]”,”hostname”:”Kobus”,”status":"success","vlan”:99999}}

The response.split section I'm using within the Custom API Integration of the Kibana UI:

target: body.results
type: array
keep_parent: true
separator: ","

The processor section I'm using within the Custom API Integration of the Kibana UI:

\\

- dissect:
    tokenizer: "%{bundle},%{category},%{commit_err},%{description},%{hostname},%{status},%{vlan}"
    field: "message"
    target_prefix: "onboarding"
- replace:
    fields:
      - field: "onboarding.category"
        pattern: "\"category\"\\:"
        replacement: "*"
- replace:
    fields:
      - field: "onboarding.commit_err"
        pattern: "\"commit_err\"\\:"
        replacement: "*"
- replace:
    fields:
      - field: "onboarding.status"
        pattern: "\"status\"\\:"
        replacement: "*"
- replace:
    fields:
      - field: "onboarding.hostname"
        pattern: "\"hostname\"\\:"
        replacement: "*"
- replace:
    fields:
      - field: "onboarding.description"
        pattern: "\"description\"\\:"
        replacement: "*"
- replace:
    fields:
      - field: "onboarding.vlan"
        pattern: "\"vlan\"\\:"
        replacement: "*"
- replace:
    fields:
      - field: "onboarding.vlan"
        pattern: "\\}\\}"
        replacement: "*"
- replace:
    fields:
      - field: "onboarding.bundle"
        pattern: "{\\\"results\\\":\\{\\\"bundle\"\\:"
        replacement: "*"

Everything works fine except for any fields that contain any commas within its string, in the example of the data i try to pass as indicated above, you will notice that the "description”:”blabla 8,192kbps - cat - Solution:(null) - Asset:[null]” field has a comma just after "blabla 8,", now what happens is that anything before the comma gets parsed into the onboarding.description field and anything after the comma gets passed into the onboarding.hostname field and the correct data that is suppose to be in the onboarding.hostname field gets parsed into the onboarding.status field with the remaining values being parsed into the onboarding.vlan field.

So basically, the processor configuration that i indicated above works perfectly fine when i ingest json formatted data from an api call so long as the fields do not contain any commas within its string value.

Any suggestions on how to modify my processor configuration to accommodate for fields that contain commas within its string value?

Hello Bradley,

Why are you trying to split the message when it is correct json that can be parsed? I haven't tried the custom API integration, but the beats support json parsing. Here is an example of the httpjson input on which the custom api integration seems to be based:

filebeat.inputs:
# Fetch your public IP every minute.
- type: httpjson
  interval: 1m
  request.url: https://api.ipify.org/?format=json
  processors:
    - decode_json_fields:
        fields: ["message"]
        target: "json"

This means, it should work if you just add the decode_json_fields processor to your integration.

Best regards
Wolfram

Hi Wolfram,

This works a treat, thanks a million! :slight_smile: