Logstash 8.7 fails to load YAML dictionaries larger than 3 MB

We are using the Logstash translate plugin to map IP addresses in our organization's logs/events to user information. The user data is loaded from a YAML dictionary. The file is large: 5.5 MB, with around 15K entries.
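For context, here is a minimal sketch of the kind of filter configuration we use (the field names and fallback value are placeholders for illustration, not our exact config):

```
filter {
  translate {
    source          => "[source][ip]"     # placeholder: field holding the IP address
    target          => "[user][name]"     # placeholder: field to receive the user info
    dictionary_path => "/usr/share/user_db/user_data.yaml"
    fallback        => "unknown"          # placeholder fallback value
  }
}
```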

Up to Logstash 8.6, this worked correctly.

However, after upgrading to Logstash 8.7, the pipeline does not start and the following error is logged:

```
Pipeline error {:pipeline_id=>"0_main", :exception=>#<LogStash::Filters::Dictionary::DictionaryFileError: Translate: The incoming YAML document exceeds the limit: 3145728 code points. when loading dictionary file at /usr/share/user_db/user_data.yaml>, :backtrace=>["org.yaml.snakeyaml.scanner.ScannerImpl.fetchMoreTokens(ScannerImpl.java:342)", "org.yaml.snakeyaml.scanner.ScannerImpl.checkToken(ScannerImpl.java:263)", "org.yaml.snakeyaml.parser.ParserImpl$ParseBlockMappingValue.produce(ParserImpl.java:694)", "org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:185)", "org.yaml.snakeyaml.parser.ParserImpl.getEvent(ParserImpl.java:195)", "org.jruby.ext.psych.PsychParser.parse(PsychParser.java:210)", "org.jruby.ext.psych.PsychParser$INVOKER$i$parse.call(PsychParser$INVOKER$i$parse.gen)", "org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:393)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:206)", "org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:325)", "org.jruby.ir.interpreter.StartupInterpreterEngine.interpret(StartupInterpreterEngine.java:72)", "org.jruby.ir.interpreter.InterpreterEngine.interpret(InterpreterEngine.java:86)"
```

I believe this is due to the upgrade of the SnakeYAML and/or JRuby versions bundled with Logstash: the 3,145,728 code points in the error is 3 × 1024 × 1024, which matches the default document size limit that newer SnakeYAML releases enforce. I found a similar bug report in the JRuby issue tracker.

Is there a way in Logstash to configure this maximum document size limit for SnakeYAML?

As a workaround I have switched to a JSON dictionary, since we had the same data available in JSON as well.
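Since the translate filter detects the dictionary format from the file extension, the only change needed was pointing dictionary_path at the JSON file (same placeholder field names as in the sketch above):

```
filter {
  translate {
    source          => "[source][ip]"
    target          => "[user][name]"
    dictionary_path => "/usr/share/user_db/user_data.json"  # JSON instead of YAML
    fallback        => "unknown"
  }
}
```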

Thanks
