I'm setting up Filebeat to ship logs from some of my servers. One of my logs contains angle brackets ("<" and ">"). Once Filebeat processes a line and outputs its JSON representation, those angle brackets have been replaced with \u003c and \u003e, respectively.
Sample log:
Sep 15 17:49:02 [26263] <warning> [rest of log omitted]
I believe that this behavior is a problem. IMO, filebeat should be putting the exact log message into the JSON envelope, only changing data necessary for that encapsulation (e.g. escaping double quotation marks). Angle brackets can go into a JSON object as-is.
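To illustrate what I mean by "only changing data necessary for that encapsulation", here is a minimal sketch using Python's standard json module (not Filebeat itself). The "message" field name and the second log line with double quotes are my own assumptions for illustration; only the first line comes from my actual log.

```python
import json

# The sample line from my log, as-is.
sample = "Sep 15 17:49:02 [26263] <warning> [rest of log omitted]"
# Hypothetical line (not from my logs) that contains double quotes.
quoted = 'Sep 15 17:49:02 [26263] <warning> said "hello"'

# "message" is an assumed field name for this sketch.
encoded_sample = json.dumps({"message": sample})
encoded_quoted = json.dumps({"message": quoted})

# Angle brackets pass through untouched; only the double quotes get escaped.
print(encoded_sample)
print(encoded_quoted)
```

Both outputs are valid JSON, and neither contains \u003c or \u003e, which is the behavior I would expect from filebeat as well.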
In my case, I'm using a JSON parser that does not interpret Unicode escape sequences. Logs that include something like "\u003c" show up as "u003c" once de-JSONified. I assume the parser is treating "\u" as an escaped literal "u".
After a bit of research, I've found that many JSON parsers do interpret these Unicode escape sequences, so this may not be a problem for everyone.
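For comparison, here is a small check with a parser that does handle the escapes, again using Python's standard json module. The two input strings are hand-written approximations of what the escaped and unescaped output would look like (with an assumed "message" field), not captured filebeat output.

```python
import json

# Hand-written approximation of the escaped form (assumed field name "message").
escaped = '{"message": "Sep 15 17:49:02 [26263] \\u003cwarning\\u003e"}'
# The same document with the angle brackets written literally.
literal = '{"message": "Sep 15 17:49:02 [26263] <warning>"}'

decoded_escaped = json.loads(escaped)
decoded_literal = json.loads(literal)

# A parser that interprets \uXXXX escapes yields the same string either way.
print(decoded_escaped["message"])
print(decoded_escaped == decoded_literal)
```

So with a parser like this one the two forms are interchangeable, and the mangled "u003c" only appears when the parser skips the \uXXXX handling.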