Beats encodes angle brackets ("<" and ">") as \u003c and \u003e in JSON output

I'm in the process of setting up filebeat to ship out logs on some of my servers. One of my logs has angle brackets in it ("<" and ">"). Once filebeat processes it and outputs its JSON representation, those angle brackets have been replaced with \u003c and \u003e, respectively.

Sample log:

Sep 15 17:49:02 [26263] <warning>  [rest of log omitted]

JSON output:

{
   "@timestamp":"2016-09-16T01:06:24.394Z",
   "beat":{
      "hostname":"[redacted]",
      "name":"[redacted]"
   },
   "input_type":"log",
   "message":"Sep 15 17:49:02 [26263] \u003cwarning\u003e  [rest of log omitted]",
   "offset":2229,
   "source":"[redacted]"
}

I believe this comes from Go's encoding/json package, which by default escapes <, >, and & as \u003c, \u003e, and \u0026. More info: https://golang.org/pkg/encoding/json/#Marshal

Yes, that's right. Go's JSON marshaller does this by default. Go 1.7 introduces an option to disable this behavior. Is the current behavior a problem?
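For reference, here is a minimal sketch of the two behaviors using only the standard library. The option added in Go 1.7 is SetEscapeHTML on json.Encoder. The event type and the two helper functions are made up for illustration and are not how Beats builds its events.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// event is a hypothetical stand-in for a Beats event; only the message matters here.
type event struct {
	Message string `json:"message"`
}

// marshalDefault uses json.Marshal, which escapes <, >, and & by default.
func marshalDefault(e event) (string, error) {
	b, err := json.Marshal(e)
	return string(b), err
}

// marshalNoHTMLEscape uses a json.Encoder with HTML escaping disabled (Go 1.7+).
func marshalNoHTMLEscape(e event) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(e); err != nil {
		return "", err
	}
	// Encode appends a trailing newline; trim it so both helpers return the same shape.
	return string(bytes.TrimRight(buf.Bytes(), "\n")), nil
}

func main() {
	e := event{Message: "Sep 15 17:49:02 [26263] <warning>"}

	escaped, err := marshalDefault(e)
	if err != nil {
		panic(err)
	}
	plain, err := marshalNoHTMLEscape(e)
	if err != nil {
		panic(err)
	}

	fmt.Println(escaped) // angle brackets appear as \u003c and \u003e
	fmt.Println(plain)   // angle brackets appear as-is
}
```

Both outputs are valid JSON for the same string; only the escaping of the angle brackets differs.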

I believe that this behavior is a problem. IMO, filebeat should be putting the exact log message into the JSON envelope, only changing data necessary for that encapsulation (e.g. escaping double quotation marks). Angle brackets can go into a JSON object as-is.

Philosophical musings aside, in what way is the current behavior a problem?

In my case, I'm using a JSON parser that does not interpret Unicode escape sequences. Logs that include something like "\u003c" show up with a literal "u003c" once de-JSONified. I assume the parser is treating "\u" as a literal "u".

I've done a bit of research since then and found that many JSON parsers do interpret these Unicode escape sequences, so this may not be a problem for everyone.
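As a quick check of that, here is a small sketch using Go's own encoding/json as the example of a parser that does interpret the escapes. The decodeMessage helper is hypothetical and only exists for this illustration. The escaped and unescaped documents decode to the same string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeMessage parses a JSON document and returns its "message" field.
func decodeMessage(raw string) (string, error) {
	var doc struct {
		Message string `json:"message"`
	}
	if err := json.Unmarshal([]byte(raw), &doc); err != nil {
		return "", err
	}
	return doc.Message, nil
}

func main() {
	// Raw string literals keep the backslashes, so the parser sees \u003c as written.
	escaped := `{"message":"\u003cwarning\u003e"}`
	literal := `{"message":"<warning>"}`

	a, err := decodeMessage(escaped)
	if err != nil {
		panic(err)
	}
	b, err := decodeMessage(literal)
	if err != nil {
		panic(err)
	}

	fmt.Println(a == b) // both decode to the same message
	fmt.Println(a)
}
```

A parser that returns "u003cwarning" for the first document is dropping the backslash instead of decoding the escape.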

Such escape sequences are part of the JSON standard, so I'd expect any JSON parser to support them. That said, the raison d'être for escaping angle brackets, "to keep some browsers from misinterpreting JSON output as HTML", hardly applies in the Beats case. Now that Go 1.7 is out and Beats is already using it, I think it's reasonable to disable that kind of escaping. I filed elastic/beats issue #2581, "Disable useless HTML escaping when marshaling JSON", for this.

@magnusbaeck Thanks for opening the issue.
