Hey all,
I have been processing stringified JSON using the json filter plugin for some time:
json {
  skip_on_invalid_json => true
  source => "data"
}
The JSON being parsed is dynamically generated by another system. Occasionally, I run into the case where the stringified JSON contains keys that match the Field Reference Grammar (e.g. {"some[key]": "value"}). While not ideal, this is technically valid JSON. Unfortunately, since the json filter plugin uses event.set, these keys are interpreted as field references when the event is updated with the decoded JSON.
Prior to 7.0.0, a field reference to something like some[key] would log a warning, but the JSON would still be decoded and the resulting object could be manipulated gracefully. Now, with STRICT as the only available Field Reference Grammar option in 7.0.0, a key such as some[key] is illegal and results in an all-out pipeline failure.
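For reference, here is a minimal reproduction sketch; the generator input and the inline sample payload are just my stand-ins for the real upstream system:

input {
  generator {
    count   => 1
    # Stand-in for the upstream system: "data" carries stringified
    # JSON whose key matches the Field Reference Grammar.
    message => '{"data": "{\"some[key]\": \"value\"}"}'
    codec   => json
  }
}
filter {
  json {
    skip_on_invalid_json => true
    source => "data"
  }
}
output {
  stdout { codec => rubydebug }
}

On 7.0.0 this fails at the json filter even though the payload is valid JSON, since skip_on_invalid_json only guards the parse, not the subsequent event.set.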
I must solve this problem from within Logstash, as I have no control over the incoming data. My possible solutions were to:
1. Attempt to sanitize the encoded JSON string prior to decoding (slow).
2. Use a ruby filter or custom plugin to sanitize the decoded JSON prior to applying it to the event (faster than 1, but more to maintain; a rough sketch follows this list).
3. Look into updating the json filter to account for this collision.
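Option 2 would look something like this sketch (the source field "data" and the underscore replacement are my placeholders, and it only sanitizes top-level keys; nested objects would need a recursive walk):

ruby {
  init => "require 'json'"
  code => '
    begin
      parsed = JSON.parse(event.get("data"))
      if parsed.is_a?(Hash)
        parsed.each do |key, value|
          # Replace the bracket characters that the STRICT grammar
          # rejects; "_" is an arbitrary placeholder.
          event.set(key.gsub(/[\[\]]/, "_"), value)
        end
      end
    rescue JSON::ParserError
      event.tag("_jsonparsefailure")
    end
  '
}

It works, but it reimplements half of the json filter, which is part of why it feels hacky to me.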
At the end of the day, I am trying to decode a valid JSON string, so 1 and 2 seem pretty hacky for such a common use case. At first glance, I don't think 3 can be accomplished without also extending the event API (or avoiding it, which is also not a good idea).
Is there a simple solution that I am missing? Do I have any other options for decoding a JSON string without keys accidentally being interpreted by the Field Reference Grammar? I still need access to elements in the parent object; is there any way I can keep those around and use the json codec plugin?
Any input would be appreciated! Thanks!