Grok parse failure with Java logs

I'm having quite a bit of trouble parsing some Java logs. The log output is the standard Java logging format, which matches the pattern defined in %{JAVALOGMESSAGE}.

The application runs on JBoss inside a Docker container. Output goes to the container's stdout, is collected by Logspout, and is pushed to Redis by a Logspout plugin that emits events in Logstash format; my Logstash indexer then retrieves it from there.
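For context, the indexer pulls events from Redis with something along these lines (just a sketch; the host and key below are placeholders, not my exact setup):

input {
  redis {
    host => "redis.example.internal"   # placeholder hostname
    data_type => "list"                # the Logspout plugin pushes events onto a Redis list
    key => "logstash"                  # placeholder list key
    codec => "json"                    # events arrive already in Logstash's JSON format
  }
}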

Without any parsing applied, I am seeing my message field content come out like this in Kibana, according to the raw JSON tab:

\u001b[0m\u001b[0m20:58:07,382 INFO [thingdoer.thing2] (default task-28) Doingthings.getthing(126822)

Which appears as the following, when viewed from the default table tab in Kibana:

[0m[0m20:58:07,382 INFO [thingdoer.thing2] (default task-28) Doingthings.getthing(126822)

I'm not sure why the [0m at the start of the log message gets doubled up like that by Logstash; the application's own stdout doesn't look that way. Despite that, I've been trying to grok the message so I can capture the information in the logs and map it to different fields on the document.

The first pattern I tried was:

\u001b[0m\u001b%{JAVALOGMESSAGE}

Despite the painful look of it, when I feed that pattern plus some sample logs of this format into grokconstructor.appspot.com, it says it matches the entire thing. However, when I tell Logstash to grok it like so, it apparently hurts Logstash's brain, and the event comes back tagged with a grok parse failure.

The configuration I am using looks like this:

filter {
  grok {
    match => {
      "message" => "\u001b[0m\u001b%{JAVALOGMESSAGE}"
    }
  }
}
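For completeness, one variant I sketched out but haven't verified (it assumes Logstash hands the backslashes through to the regex engine unmodified, and that the regex engine accepts \e as the ESC character) would consume any leading color codes generically and escape the literal brackets:

filter {
  grok {
    match => {
      # (?:\e\[\d+m)* eats zero or more leading ANSI color codes such as ESC[0m
      "message" => "(?:\e\[\d+m)*%{JAVALOGMESSAGE:java_message}"
    }
  }
}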

What am I missing here in this grok pattern? It seems like it should work. Thanks for any help.

Why on earth are there escape sequences in your log in the first place? They don't belong there and I'd look into that instead of patching things up on the Logstash side.

Resolved the issue: it was a problem with JBoss/WildFly. In WildFly's default Docker image, the standalone configuration gives stdout a colored pattern meant for a terminal.

Since the app was running in Docker, and its console stdout was being scooped up by Logspout and then sent to Logstash, that extra color formatting was being captured as escape characters in the log message.

For posterity: if you're running WildFly in Docker and collecting logs with Logspout to send to Logstash, you need to change your WildFly configuration and set the console handler's named formatter to plain "PATTERN". To do that, you need to build a custom standalone XML configuration for your WildFly Docker image; a sketch of the relevant change is below the links.

See here for how to customize your WildFly image:
https://goldmann.pl/blog/2014/07/23/customizing-the-configuration-of-the-wildfly-docker-image/

This is the setting you'll want to change in your custom standalone XML doc:
https://docs.jboss.org/author/display/WFLY8/Handlers#Handlers-namedformatter
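Roughly, the change in the logging subsystem of standalone.xml looks like this (a sketch based on the stock WildFly configuration; your handler name and log levels may differ):

<console-handler name="CONSOLE">
    <level name="INFO"/>
    <formatter>
        <!-- the default image ships with name="COLOR-PATTERN", which is what
             injects the ANSI escape codes; switch it to the plain formatter -->
        <named-formatter name="PATTERN"/>
    </formatter>
</console-handler>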