Seeing Grok regexp exception "incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)"

Hi everyone,

I am seeing this error in my Logstash logs:

[2018-02-15T10:00:48,009][WARN ][logstash.filters.grok    ] Grok regexp threw exception {:exception=>"incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)", :backtrace=>["org/jruby/RubyRegexp.java:1107:in `match'"
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/jls-grok-0.11.4/lib/grok-pure.rb:182:in `execute'"
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.2/lib/logstash/filters/grok/timeout_enforcer.rb:20:in `grok_till_timeout'"
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.2/lib/logstash/filters/grok.rb:342:in `block in match_against_groks'"
"org/jruby/RubyArray.java:1734:in `each'"
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.2/lib/logstash/filters/grok.rb:339:in `match_against_groks'"
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.2/lib/logstash/filters/grok.rb:328:in `match'"
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.2/lib/logstash/filters/grok.rb:296:in `block in filter'"
"org/jruby/RubyHash.java:1343:in `each'"
"/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.2/lib/logstash/filters/grok.rb:295:in `filter'"
"/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:145:in `do_filter'"
"/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:164:in `block in multi_filter'"
"org/jruby/RubyArray.java:1734:in `each'"
"/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:161:in `multi_filter'"
"/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:47:in `multi_filter'"
"(eval):18987:in `block in initialize'"
"org/jruby/RubyArray.java:1734:in `each'"
"(eval):18984:in `block in initialize'"
"(eval):3140:in `block in filter_func'"
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:447:in `filter_batch'"
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:426:in `worker_loop'"
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:385:in `block in start_workers'"], :class=>"Encoding::CompatibilityError

The interesting part of the error seems to be
incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)

I believe (though I have not completely proven) that it is being thrown because of this character: ®
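
For reference, the same exception class can be reproduced in plain Ruby, outside Logstash. The sketch below is an illustration only: the sample log line and the helper `match_error` are made up, and nothing here is confirmed to be what grok does internally. It shows that a regexp containing a non-ASCII character is fixed to UTF-8 and refuses to match a string tagged ASCII-8BIT that holds non-ASCII bytes, while the same bytes tagged as UTF-8 match without error.

```ruby
# Sketch only: plain Ruby, not Logstash or grok internals.
# `match_error` and the sample line are hypothetical, for illustration.

# Returns the name of the encoding error raised by the match, or nil if none.
def match_error(pattern, text)
  pattern.match(text)
  nil
rescue Encoding::CompatibilityError => e
  e.class.name
end

# A regexp with a non-ASCII character is fixed to UTF-8.
pattern = /\u00AE/ # the registered sign

utf8_line   = "Acme\u00AE login" # UTF-8 string
binary_line = utf8_line.b        # same bytes, tagged ASCII-8BIT

# UTF-8 regexp against an ASCII-8BIT string with non-ASCII bytes: raises.
puts match_error(pattern, binary_line).inspect

# Re-tagging the same bytes as UTF-8 lets the match proceed.
retagged = binary_line.dup.force_encoding(Encoding::UTF_8)
puts match_error(pattern, retagged).inspect

# An ASCII-only regexp is not fixed to UTF-8 and matches the binary string.
puts match_error(/login/, binary_line).inspect
```

In plain Ruby the error needs both a non-ASCII character in the pattern and non-ASCII bytes in an ASCII-8BIT string. Whether Logstash reaches that state in the same way is an assumption that has not been verified here.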

Is there a way to configure grok to deal with this character?
Logstash is not dropping the offending log lines, it is just not parsing them.

If a grok-configuration fix is not possible, workaround suggestions are welcome, though I can't stop that character from appearing in my app logs, as it comes from user input.


Same problem here, we've got a "μ" in a grok regex causing this, resulting in a stacktrace very similar to above.

The issue appeared for us only after upgrading Logstash. The problem was not present in 6.0.1, but it appears for us with 6.1.1 (and likewise with 6.2.1), installed via https://artifacts.elastic.co/packages/6.x/apt on Ubuntu.

Stacktrace for us:

[2018-02-20T09:26:45,504][WARN ][logstash.filters.grok ] Grok regexp threw exception {:exception=>"incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)", :backtrace=>["org/jruby/RubyRegexp.java:1107:in `match'", "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/jls-grok-0.11.4/lib/grok-pure.rb:182:in `execute'", "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.1/lib/logstash/filters/grok/timeout_enforcer.rb:20:in `grok_till_timeout'", "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.1/lib/logstash/filters/grok.rb:347:in `block in match_against_groks'", "org/jruby/RubyArray.java:1734:in `each'", "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.1/lib/logstash/filters/grok.rb:344:in `match_against_groks'", "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.1/lib/logstash/filters/grok.rb:333:in `match'", "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.1/lib/logstash/filters/grok.rb:301:in `block in filter'", "org/jruby/RubyHash.java:1343:in `each'", "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-grok-4.0.1/lib/logstash/filters/grok.rb:300:in `filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:145:in `do_filter'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:164:in `block in multi_filter'", "org/jruby/RubyArray.java:1734:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:161:in `multi_filter'", "/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:48:in `multi_filter'", "(eval):823:in `block in initialize'", "org/jruby/RubyArray.java:1734:in `each'", "(eval):808:in `block in initialize'", "(eval):524:in `block in filter_func'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:455:in `filter_batch'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:434:in `worker_loop'", 
"/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:393:in `block in start_workers'"], :class=>"Encoding::CompatibilityError"}

I believe that we only started seeing this happen in 6.2, but we don't have logs going back before we upgraded, so I can't confirm.

I am also receiving this error on 6.2; I had no errors previously on 5, 6.0, or 6.1.

Same here. Never had a problem with encoding until 6.2.x.

For everyone who hasn't yet posted which character appears to be breaking their parsing, could you?

I am having the same problem with log entries containing the microsecond symbol µ.

I was originally using 6.1.3. I upgraded to 6.2.2 but the problem remains. Same errors in the logs.

Has anyone come up with a workaround?

Did anyone figure out what the issue is?

I have not found a solution.

We are just eating the errors for now.

I too am getting this error. I have Swedish characters in some logs (Å Ä Ö).

We are aware of this issue and it has been fixed already in https://github.com/elastic/logstash/pull/9307.
The fix will be included in the upcoming release (version 6.2.4).

