Filebeat escaping backslashes on its own


#1

I have a file with log entries like the one below. The Java stack traces contain a literal \n instead of line breaks, so each log entry is a single line:

2016-05-30 11:25:53,907 WARN 4209 [DiscoveryClient-InstanceInfoReplicator-0] com.netflix.discovery.InstanceInfoReplicator : There was a problem with the instance info replicator {} com.netflix.discovery.shared.transport.TransportException: Cannot execute request on any known server\n    at com.netflix.discovery.shared.transport.decorator.RetryableEurekaHttpClient.execute(RetryableEurekaHttpClient.java:111)\n     at com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator.register(EurekaHttpClientDecorator.java:56)\n     at com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator$1.execute(EurekaHttpClientDecorator.java:59)\n    at com.netflix.discovery.shared.transport.decorator.SessionedEurekaHttpClient.execute(SessionedEurekaHttpClient.java:77)\n      at com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator.register(EurekaHttpClientDecorator.java:56)\n     at com.netflix.discovery.DiscoveryClient.register(DiscoveryClient.java:806)\n   at com.netflix.discovery.InstanceInfoReplicator.run(InstanceInfoReplicator.java:103)\n  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n    at java.util.concurrent.FutureTask.run(FutureTask.java:266)\n   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)\n      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)\n     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n    at java.lang.Thread.run(Thread.java:745)\n

If I tell Filebeat to watch this file and send it to Logstash (with debug logging enabled for verbosity), I can see that the backslashes are being escaped, so Logstash (and then Elasticsearch) receives the characters \\n instead of \n:

2016/06/15 16:32:40.250503 publish.go:112: DBG  Publish: {
  "@timestamp": "2016-06-15T16:32:35.218Z",
  "beat": {
    "hostname": "225f1d5b0641",
    "name": "225f1d5b0641"
  },
  "count": 1,
  "fields": {
    "type": "app"
  },
  "input_type": "log",
  "message": "2016-05-30 11:25:53,907 WARN 4209 [DiscoveryClient-InstanceInfoReplicator-0] com.netflix.discovery.InstanceInfoReplicator : There was a problem with the instance info replicator {} com.netflix.discovery.shared.transport.TransportException: Cannot execute request on any known server\\n\tat com.netflix.discovery.shared.transport.decorator.RetryableEurekaHttpClient.execute(RetryableEurekaHttpClient.java:111)\\n\tat com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator.register(EurekaHttpClientDecorator.java:56)\\n\tat com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator$1.execute(EurekaHttpClientDecorator.java:59)\\n\tat com.netflix.discovery.shared.transport.decorator.SessionedEurekaHttpClient.execute(SessionedEurekaHttpClient.java:77)\\n\tat com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator.register(EurekaHttpClientDecorator.java:56)\\n\tat com.netflix.discovery.DiscoveryClient.register(DiscoveryClient.java:806)\\n\tat com.netflix.discovery.InstanceInfoReplicator.run(InstanceInfoReplicator.java:103)\\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)\\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)\\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\\n\tat java.lang.Thread.run(Thread.java:745)\\n",
  "offset": 101843,
  "source": "/tmp/app.log",
  "tags": [
  ],
  "type": "log"
}

This is a problem because it's really hard to read one-line stack traces in Kibana.
Is there a way to tell Filebeat to avoid escaping this?

Thank you


#2

This is my Filebeat configuration:

filebeat:
  prospectors:
    -
      paths:
        - "/tmp/access.log"

      encoding: utf-8
      input_type: log
      fields:
        type: access
    -
      paths:
        - "/tmp/*.log"

      encoding: utf-8
      input_type: log
      exclude_files: ["access.log"]
      fields:
        type: app

  config_dir:

output:
  logstash:
    enabled: true
    hosts: ["192.168.99.100:5001"]
    index: logstash

shipper:
  tags: []

logging:
  to_syslog: false
  to_files: false

(Magnus Bäck) #3

This is a problem because it's really hard to read one-line stack traces on Kibana.

I'm not sure exactly what you mean. Your logs contain \n rather than an actual newline character, yes? And you expect those escape sequences to be transformed into the characters they represent so that the stack trace becomes readable?

Is there a way to tell filebeat to avoid escaping this?

Your example shows a JSON representation of the event that Filebeat sends to Logstash. The JSON specification requires that backslashes are escaped. When Logstash receives this the JSON string will be deserialized back to \n and the \t that follows \\n in the JSON example will be deserialized to an actual tab character.
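The round trip Magnus describes can be checked in plain Ruby with the standard `json` library (the log text here is shortened, and `com.example.Foo` is just placeholder text):

```ruby
require 'json'

# A log line containing a literal backslash-n sequence (two characters,
# backslash then 'n'), as in the stack traces above, followed by a real tab.
line = 'Cannot execute request on any known server\n' + "\tat com.example.Foo"

# Serializing doubles the backslash in the JSON text, which is what the
# Filebeat debug output shows...
json = JSON.generate('message' => line)

# ...and parsing restores the original two-character \n sequence exactly.
parsed = JSON.parse(json)
parsed['message'] == line  # => true
```

So the `\\n` in the debug output is purely a wire-format artifact; what Logstash stores after deserialization is the same two-character `\n` sequence that was in the log file.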


#4

Thank you for your reply.

Yes, you got it. I see the \n characters literally in Kibana, instead of the actual line breaks. If I query Elasticsearch directly, I can see that the literal \n sequence is saved instead of an actual newline, so it's like Logstash is not deserializing it?

This is my Logstash configuration:

input {
  beats {
    port => 5001
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:date} %{LOGLEVEL:priority} %{NUMBER:pid} \[%{DATA:thread}\] %{DATA:logger} : %{GREEDYDATA:message}\s+\{%{DATA:mdc}\}\s*(%{GREEDYDATA:stacktrace})" }
    overwrite => [ "message" ]
    add_tag => [ "app_log" ]
  }
}

(Magnus Bäck) #5

Logstash does not turn \n sequences into newline characters on its own, so if your logs really contain \n sequences you need to put in a filter to transform them. Logstash's configuration language has some issues with escaping, so I'd be surprised if you're able to use the mutate filter's gsub option for this, but you can do it with a ruby filter.


#6

Thanks again.

Any link where I can find out how to go about this ruby filter?


#7

With

ruby {
  code => "
    temp = event['stacktrace']
    event['stacktrace'] = temp.gsub(/\\n/, '\n')
  "
}

I get

Cannot execute request on any known server\\n\tat com.netflix.discovery.s...

But with

ruby {
  code => "
    temp = event['stacktrace']
    event['stacktrace'] = temp.gsub(/\\n/, 'n')
  "
}

I get

Cannot execute request on any known servern\tat com.netflix.discovery.s...

I don't understand :confounded:


(Magnus Bäck) #8

Yeah, this might be tricky given how Logstash deals with string escaping. Try this:

event['stacktrace'] = event['stacktrace'].gsub(92.chr + 'n', 10.chr)

(You don't need a temporary variable.)
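The trick can be verified in plain Ruby, outside Logstash entirely. Building the search and replacement strings with `chr` (92 is the backslash, 10 the newline, 9 the tab) sidesteps every layer of string escaping; the sample text and `com.example.Foo` are just placeholders:

```ruby
# Construct a string containing a literal backslash-n sequence, with no
# ambiguity about how many times it has been escaped.
s = 'server' + 92.chr + 'n' + 9.chr + 'at com.example.Foo'

# Replace the two-character \n sequence with a real newline.
fixed = s.gsub(92.chr + 'n', 10.chr)

fixed.include?(10.chr)  # => true: a real newline is now present
fixed.include?(92.chr)  # => false: the literal backslash is gone
```

Inside a Logstash ruby filter the same expression operates on `event['stacktrace']`; because the strings are built from character codes, the Logstash config parser's own escaping rules never touch them.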


#9

It works!

Thank you for all your help @magnusbaeck


(system) #10

This topic was automatically closed after 21 days. New replies are no longer allowed.