Logstash Pipeline from 6.1 not working in 6.2.1

This is exhausting....

I'm creating a guide and started a new Elastic Stack setup from scratch, new VMs and all. I installed Elastic Stack 6.2.1 with each application on its own VM. Logstash started with my pipeline with no issue, but when I tried to ingest a file it errored out with the following. The pipeline is a direct copy from a working Logstash 6.1.1 installation. I've even tried retyping the entire pipeline in Notepad++, ensuring UTF-8 encoding is used.

[2018-02-19T20:30:43,161][ERROR][logstash.pipeline        ] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash. {:pipeline_id=>"main", "exception"=>"ASCII-8BIT", "backtrace"=>["j
ava.nio.charset.Charset.forName(Charset.java:531)", "nokogiri.internals.SaveContextVisitor.encodeStringToHtmlEntity(SaveContextVisitor.java:758)", "nokogiri.internals.SaveContextVisitor.enter(SaveContextVisitor.java:750)", "nokogiri.XmlText.accept(XmlText.java
:92)", "nokogiri.XmlNode.native_write_to(XmlNode.java:1272)", "C_3a_.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.nokogiri_minus_1_dot_8_dot_2_minus_java.lib.nokogiri.xml.node.RUBY$method$write_to$0(C:/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.8.2
-java/lib/nokogiri/xml/node.rb:697)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:77)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:93)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSi
te.java:145)", "C_3a_.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.nokogiri_minus_1_dot_8_dot_2_minus_java.lib.nokogiri.xml.node.RUBY$method$serialize$0(C:/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.8.2-java/lib/nokogiri/xml/node.rb:629)", "org.jru
by.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:77)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:93)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", "C_3a_.logstash.vendor.bundle.j
ruby.$2_dot_3_dot_0.gems.nokogiri_minus_1_dot_8_dot_2_minus_java.lib.nokogiri.xml.node.RUBY$method$to_xml$0(C:/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.8.2-java/lib/nokogiri/xml/node.rb:652)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call
(CompiledIRMethod.java:77)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:93)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", "C_3a_.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.nokogiri_minus_1_dot_8_d
ot_2_minus_java.lib.nokogiri.xml.node.RUBY$method$to_s$0(C:/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.8.2-java/lib/nokogiri/xml/node.rb:513)", "C_3a_.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.nokogiri_minus_1_dot_8_dot_2_minus_java.lib.nokogiri
.xml.node.RUBY$method$to_s$0$__VARARGS__(C:/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.8.2-java/lib/nokogiri/xml/node.rb)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:77)", "org.jruby.internal.runtime.methods.MixedM
odeIRMethod.call(MixedModeIRMethod.java:93)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", "C_3a_.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.logstash_minus_filter_minus_xml_minus_4_dot_0_dot_5.lib.logstash.filters.xml.RUBY$block$filter$
2(C:/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-xml-4.0.5/lib/logstash/filters/xml.rb:171)", "org.jruby.runtime.CompiledIRBlockBody.yieldDirect(CompiledIRBlockBody.java:156)", "org.jruby.runtime.BlockBody.yield(BlockBody.java:114)", "org.jruby.run
time.Block.yield(Block.java:165)", "org.jruby.ir.runtime.IRRuntimeHelpers.yield(IRRuntimeHelpers.java:415)", "org.jruby.ir.targets.YieldSite.yield(YieldSite.java:87)", "C_3a_.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.nokogiri_minus_1_dot_8_dot_2_minus_j
ava.lib.nokogiri.xml.node_set.RUBY$block$each$1(C:/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.8.2-java/lib/nokogiri/xml/node_set.rb:190)", "org.jruby.runtime.CompiledIRBlockBody.yieldDirect(CompiledIRBlockBody.java:156)", "org.jruby.runtime.BlockBody.y
ield(BlockBody.java:114)", "org.jruby.runtime.Block.yield(Block.java:165)", "org.jruby.RubyInteger.fixnumUpto(RubyInteger.java:162)", "org.jruby.RubyInteger.upto(RubyInteger.java:134)", "C_3a_.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.nokogiri_minus_1_d
ot_8_dot_2_minus_java.lib.nokogiri.xml.node_set.RUBY$method$each$0(C:/logstash/vendor/bundle/jruby/2.3.0/gems/nokogiri-1.8.2-java/lib/nokogiri/xml/node_set.rb:189)", "C_3a_.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.logstash_minus_filter_minus_xml_minus_
4_dot_0_dot_5.lib.logstash.filters.xml.RUBY$block$filter$1(C:/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-xml-4.0.5/lib/logstash/filters/xml.rb:159)", "org.jruby.runtime.CompiledIRBlockBody.yieldDirect(CompiledIRBlockBody.java:156)", "org.jruby.run
time.BlockBody.yield(BlockBody.java:114)", "org.jruby.runtime.Block.yield(Block.java:165)", "org.jruby.RubyHash$12.visit(RubyHash.java:1362)", "org.jruby.RubyHash$12.visit(RubyHash.java:1359)", "org.jruby.RubyHash.visitLimited(RubyHash.java:662)", "org.jruby.R
ubyHash.visitAll(RubyHash.java:647)", "org.jruby.RubyHash.iteratorVisitAll(RubyHash.java:1319)", "org.jruby.RubyHash.each_pairCommon(RubyHash.java:1354)", "org.jruby.RubyHash.each(RubyHash.java:1343)", "C_3a_.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.lo
gstash_minus_filter_minus_xml_minus_4_dot_0_dot_5.lib.logstash.filters.xml.RUBY$method$filter$0(C:/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-xml-4.0.5/lib/logstash/filters/xml.rb:152)", "C_3a_.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.logs
tash_minus_filter_minus_xml_minus_4_dot_0_dot_5.lib.logstash.filters.xml.RUBY$method$filter$0$__VARARGS__(C:/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-filter-xml-4.0.5/lib/logstash/filters/xml.rb)", "org.jruby.internal.runtime.methods.CompiledIRMethod.c
all(CompiledIRMethod.java:77)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:93)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", "C_3a_.logstash.logstash_minus_core.lib.logstash.filters.base.RUBY$method$do_
filter$0(C:/logstash/logstash-core/lib/logstash/filters/base.rb:145)", "C_3a_.logstash.logstash_minus_core.lib.logstash.filters.base.RUBY$method$do_filter$0$__VARARGS__(C:/logstash/logstash-core/lib/logstash/filters/base.rb)", "org.jruby.internal.runtime.metho
ds.CompiledIRMethod.call(CompiledIRMethod.java:77)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:93)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", "C_3a_.logstash.logstash_minus_core.lib.logstash.filters
.base.RUBY$block$multi_filter$1(C:/logstash/logstash-core/lib/logstash/filters/base.rb:164)", "org.jruby.runtime.CompiledIRBlockBody.yieldDirect(CompiledIRBlockBody.java:156)", "org.jruby.runtime.BlockBody.yield(BlockBody.java:114)", "org.jruby.runtime.Block.y
ield(Block.java:165)", "org.jruby.RubyArray.each(RubyArray.java:1734)", "C_3a_.logstash.logstash_minus_core.lib.logstash.filters.base.RUBY$method$multi_filter$0(C:/logstash/logstash-core/lib/logstash/filters/base.rb:161)", "C_3a_.logstash.logstash_minus_core.l
ib.logstash.filters.base.RUBY$method$multi_filter$0$__VARARGS__(C:/logstash/logstash-core/lib/logstash/filters/base.rb)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:77)", "org.jruby.internal.runtime.methods.MixedModeIRMetho
d.call(MixedModeIRMethod.java:93)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", "C_3a_.logstash.logstash_minus_core.lib.logstash.filter_delegator.RUBY$method$multi_filter$0(C:/logstash/logstash-core/lib/logstash/filter_delegator.rb:47)", "or
g.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:103)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:163)", "org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:200)", "org.j
ruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:161)", "org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:314)", "org.jruby.ir.interpreter.StartupInterpreterEngine.interpret(StartupInterpreterEngine.java:73)", "org.j
ruby.ir.interpreter.Interpreter.INTERPRET_BLOCK(Interpreter.java:132)", "org.jruby.runtime.MixedModeIRBlockBody.commonYieldPath(MixedModeIRBlockBody.java:148)", "org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:73)", "org.jruby.runtime.Block.call(Block.java
:124)", "org.jruby.RubyProc.call(RubyProc.java:289)", "org.jruby.internal.runtime.methods.ProcMethod.call(ProcMethod.java:63)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", "C_3a_.logstash.logstash_minus_core.lib.logstash.pipeline.RUBY$method
$filter_batch$0(C:/logstash/logstash-core/lib/logstash/pipeline.rb:447)", "C_3a_.logstash.logstash_minus_core.lib.logstash.pipeline.RUBY$method$filter_batch$0$__VARARGS__(C:/logstash/logstash-core/lib/logstash/pipeline.rb)", "org.jruby.internal.runtime.methods
.CompiledIRMethod.call(CompiledIRMethod.java:77)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:93)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", "C_3a_.logstash.logstash_minus_core.lib.logstash.pipeline.
RUBY$method$worker_loop$0(C:/logstash/logstash-core/lib/logstash/pipeline.rb:426)", "C_3a_.logstash.logstash_minus_core.lib.logstash.pipeline.RUBY$method$worker_loop$0$__VARARGS__(C:/logstash/logstash-core/lib/logstash/pipeline.rb)", "org.jruby.internal.runtim
e.methods.CompiledIRMethod.call(CompiledIRMethod.java:77)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:93)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", "C_3a_.logstash.logstash_minus_core.lib.logstash.
pipeline.RUBY$block$start_workers$2(C:/logstash/logstash-core/lib/logstash/pipeline.rb:385)", "org.jruby.runtime.CompiledIRBlockBody.callDirect(CompiledIRBlockBody.java:145)", "org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:71)", "org.jruby.runtime.Block.
call(Block.java:124)", "org.jruby.RubyProc.call(RubyProc.java:289)", "org.jruby.RubyProc.call(RubyProc.java:246)", "org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:104)", "java.lang.Thread.run(Thread.java:748)"], :thread=>"#<Thread:0x718b3138 sle
ep>"}

Is that too long to fit into one post? Can you also post your config?

It was.

Here's the pipeline:

input {
  file {
    id => "C:\DMARC\*.xml"
    path => "C:/DMARC/*.xml"
    discover_interval => 5
    codec => multiline {
      auto_flush_interval => 5
      negate => true
      pattern => "<record>"
      what => "previous"
    }
  }
}
filter {
  xml {
    id => "Field Extraction"
    force_array => true
    store_xml => false
    source => "message"
    xpath => [
      "record/report_metadata/org_name/text()", "report.org",
      "record/report_metadata/email/text()", "report.org_contact",
      "record/report_metadata/extra_contact_info/text()", "report.additional_contact",
      "record/report_metadata/report_id/text()", "report.id",
      "record/report_metadata/date_range/begin/text()", "report.start",
      "record/report_metadata/date_range/end/text()", "report.end",
      "record/policy_published/domain/text()", "policy.domain",
      "record/policy_published/aspf/text()", "policy.spf_mode",
      "record/policy_published/adkim/text()", "policy.dkim_mode",
      "record/policy_published/p/text()", "policy.dmarc.domain_action",
      "record/policy_published/sp/text()", "policy.dmarc.subdomain_action",
      "record/policy_published/pct/text()", "policy.percentage",
      "record/row/source_ip/text()", "email.source_ip",
      "record/row/count/text()", "email.count",
      "record/row/policy_evaluated/disposition/text()", "email.dmarc_action",
      "record/row/policy_evaluated/spf/text()", "email.spf_evaluation",
      "record/row/policy_evaluated/dkim/text()", "email.dkim_evaluation",
      "record/row/policy_evaluated/reason/type/text()", "dmarc.override_type",
      "record/row/policy_evaluated/reason/comment/text()", "dmarc.override_comment",
      "record/identifiers/envelope_to/text()", "email.envelope_to",
      "record/identifiers/envelope_from/text()", "email.envelope_from",
      "record/identifiers/header_from/text()", "email.header_from",
      "record/auth_results/dkim/domain/text()", "authresult.dkim_domain",
      "record/auth_results/dkim/result/text()", "authresult.dkim_result",
      "record/auth_results/spf/domain/text()", "authresult.spf_domain",
      "record/auth_results/spf/scope/text()", "authresult.spf_scope",
      "record/auth_results/spf/result/text()", "authresult.spf_result"
    ]
  }
    geoip {
      id => "IP Geo-Mapping"
      source => "email.source_ip"
      add_field => {
        "[geoip][location][coordinates]" => "%{[geoip][location][lat]}, %{[geoip][location][lon]}"
      }
    }
}
output {
  elasticsearch {
    id => "Send to Elasticsearch"
    hosts => ["ElasticStack:9200"]
#    user => "elastic"
#    password => "elastic"
    http_compression => true
    template => "C:/logstash/templates/dmarcxmltemplate.json"
    template_name => "dmarcxml"
    index => "dmarcxml-%{+YYYY.MM.DD}"
  }
}
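
A side note on the index setting in the output above, unrelated to the exception: Logstash formats `%{+...}` date patterns with Joda-Time, where `DD` means day of the year and `dd` means day of the month, so `%{+YYYY.MM.DD}` is probably not what was intended. The sketch below is only a rough Ruby analogy using `strftime` (it is not how Logstash formats the index name, and the padding of the day-of-year value differs), with the date taken from the log timestamp above.

```ruby
# Rough analogy only: Logstash uses Joda-Time patterns, not strftime.
# %j here plays the role of Joda's "DD" (day of year),
# %d plays the role of Joda's "dd" (day of month).
t = Time.utc(2018, 2, 19)

day_of_year  = t.strftime("%Y.%m.%j")  # February 19 is the 50th day of the year
day_of_month = t.strftime("%Y.%m.%d")

puts "dmarcxml-#{day_of_year}"   # index name with a day-of-year suffix
puts "dmarcxml-#{day_of_month}"  # index name with the usual daily suffix
```

If daily indices are the goal, `%{+YYYY.MM.dd}` is the pattern normally used.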

Can you put the error up on a gist/pastebin/etc.? It's a bit mangled as is, and the continuity and formatting are important.

https://pastebin.com/VU6QHzaU

Thanks for that.

What version of the XML filter plugin is installed? bin/logstash-plugin list '*xml*' should show that.

C:\logstash\bin>logstash-plugin list *xml*
RegexpError: (RegexpError) target of repeat operator is not specified: /*xml*/i
  block in filtered_specs at C:/logstash/lib/pluginmanager/list.rb:35
                   select at org/jruby/RubyArray.java:2565
           filtered_specs at C:/logstash/lib/pluginmanager/list.rb:35
                  execute at C:/logstash/lib/pluginmanager/list.rb:19
                      run at C:/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:67
                  execute at C:/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/subcommand/execution.rb:11
                      run at C:/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:67
                      run at C:/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:132
                   <main> at C:\logstash\lib\pluginmanager\main.rb:48

Just realized I was missing a couple of characters in the command you gave me, but it doesn't appear to make a difference. BTW, I'm using the xml filter that came with the install.

C:\logstash\bin>logstash-plugin list '*xml*'
RegexpError: (RegexpError) target of repeat operator is not specified: /*xml*/i
  block in filtered_specs at C:/logstash/lib/pluginmanager/list.rb:35
                   select at org/jruby/RubyArray.java:2565
           filtered_specs at C:/logstash/lib/pluginmanager/list.rb:35
                  execute at C:/logstash/lib/pluginmanager/list.rb:19
                      run at C:/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:67
                  execute at C:/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/subcommand/execution.rb:11
                      run at C:/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:67
                      run at C:/logstash/vendor/bundle/jruby/2.3.0/gems/clamp-0.6.5/lib/clamp/command.rb:132
                   <main> at C:\logstash\lib\pluginmanager\main.rb:48
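
Judging by the trace (`/*xml*/i` raised from `filtered_specs` in list.rb), the plugin manager appears to compile the argument into a case-insensitive regular expression instead of treating it as a shell glob. A leading `*` has nothing to repeat, hence the RegexpError. The sketch below reproduces that behaviour in plain Ruby; `filter_plugins` is a hypothetical stand-in, not the actual plugin manager code.

```ruby
# Hypothetical stand-in for the plugin manager's name filter:
# the pattern is compiled as a case-insensitive regex, not a glob.
def filter_plugins(names, pattern)
  regex = Regexp.new(pattern, Regexp::IGNORECASE)
  names.select { |name| name =~ regex }
end

names = ["logstash-filter-xml", "logstash-filter-json", "logstash-codec-plain"]

begin
  filter_plugins(names, "*xml*")   # glob-style argument: invalid as a regex
rescue RegexpError => e
  puts "RegexpError: #{e.message}"
end

puts filter_plugins(names, "xml").inspect  # a bare substring is a valid regex
```

If that reading is right, `logstash-plugin list xml` (no asterisks) should list the xml filter without the error.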

What does just logstash-plugin list show?

PS C:\logstash\bin> .\logstash-plugin list
logstash-codec-cef
logstash-codec-collectd
logstash-codec-dots
logstash-codec-edn
logstash-codec-edn_lines
logstash-codec-es_bulk
logstash-codec-fluent
logstash-codec-graphite
logstash-codec-json
logstash-codec-json_lines
logstash-codec-line
logstash-codec-msgpack
logstash-codec-multiline
logstash-codec-netflow
logstash-codec-plain
logstash-codec-rubydebug
logstash-filter-aggregate
logstash-filter-anonymize
logstash-filter-cidr
logstash-filter-clone
logstash-filter-csv
logstash-filter-date
logstash-filter-de_dot
logstash-filter-dissect
logstash-filter-dns
logstash-filter-drop
logstash-filter-elasticsearch
logstash-filter-fingerprint
logstash-filter-geoip
logstash-filter-grok
logstash-filter-jdbc_static
logstash-filter-jdbc_streaming
logstash-filter-json
logstash-filter-kv
logstash-filter-metrics
logstash-filter-mutate
logstash-filter-ruby
logstash-filter-sleep
logstash-filter-split
logstash-filter-syslog_pri
logstash-filter-throttle
logstash-filter-translate
logstash-filter-truncate
logstash-filter-urldecode
logstash-filter-useragent
logstash-filter-xml
logstash-input-beats
logstash-input-dead_letter_queue
logstash-input-elasticsearch
logstash-input-exec
logstash-input-file
logstash-input-ganglia
logstash-input-gelf
logstash-input-generator
logstash-input-graphite
logstash-input-heartbeat
logstash-input-http
logstash-input-http_poller
logstash-input-imap
logstash-input-jdbc
logstash-input-kafka
logstash-input-pipe
logstash-input-rabbitmq
logstash-input-redis
logstash-input-s3
logstash-input-snmptrap
logstash-input-sqs
logstash-input-stdin
logstash-input-syslog
logstash-input-tcp
logstash-input-twitter
logstash-input-udp
logstash-input-unix
logstash-output-cloudwatch
logstash-output-csv
logstash-output-elasticsearch
logstash-output-email
logstash-output-file
logstash-output-graphite
logstash-output-http
logstash-output-kafka
logstash-output-lumberjack
logstash-output-nagios
logstash-output-null
logstash-output-pagerduty
logstash-output-pipe
logstash-output-rabbitmq
logstash-output-redis
logstash-output-s3
logstash-output-sns
logstash-output-sqs
logstash-output-stdout
logstash-output-tcp
logstash-output-udp
logstash-output-webhdfs
logstash-patterns-core
PS C:\logstash\bin> ^Q

logstash-plugin list --verbose please.

PS C:\logstash\bin> .\logstash-plugin list --verbose
logstash-codec-cef (5.0.2)
logstash-codec-collectd (3.0.8)
logstash-codec-dots (3.0.6)
logstash-codec-edn (3.0.6)
logstash-codec-edn_lines (3.0.6)
logstash-codec-es_bulk (3.0.6)
logstash-codec-fluent (3.1.5)
logstash-codec-graphite (3.0.5)
logstash-codec-json (3.0.5)
logstash-codec-json_lines (3.0.5)
logstash-codec-line (3.0.8)
logstash-codec-msgpack (3.0.7)
logstash-codec-multiline (3.0.9)
logstash-codec-netflow (3.10.0)
logstash-codec-plain (3.0.6)
logstash-codec-rubydebug (3.0.5)
logstash-filter-aggregate (2.7.2)
logstash-filter-anonymize (3.0.6)
logstash-filter-cidr (3.1.2)
logstash-filter-clone (3.0.5)
logstash-filter-csv (3.0.8)
logstash-filter-date (3.1.9)
logstash-filter-de_dot (1.0.3)
logstash-filter-dissect (1.1.3)
logstash-filter-dns (3.0.7)
logstash-filter-drop (3.0.5)
logstash-filter-elasticsearch (3.3.0)
logstash-filter-fingerprint (3.1.2)
logstash-filter-geoip (5.0.3)
logstash-filter-grok (4.0.2)
logstash-filter-jdbc_static (1.0.0)
logstash-filter-jdbc_streaming (1.0.3)
logstash-filter-json (3.0.5)
logstash-filter-kv (4.0.3)
logstash-filter-metrics (4.0.5)
logstash-filter-mutate (3.2.0)
logstash-filter-ruby (3.1.3)
logstash-filter-sleep (3.0.6)
logstash-filter-split (3.1.6)
logstash-filter-syslog_pri (3.0.5)
logstash-filter-throttle (4.0.4)
logstash-filter-translate (3.0.4)
logstash-filter-truncate (1.0.4)
logstash-filter-urldecode (3.0.6)
logstash-filter-useragent (3.2.2)
logstash-filter-xml (4.0.5)
logstash-input-beats (5.0.6)
logstash-input-dead_letter_queue (1.1.2)
logstash-input-elasticsearch (4.2.0)
logstash-input-exec (3.1.5)
logstash-input-file (4.0.3)
logstash-input-ganglia (3.1.3)
logstash-input-gelf (3.1.0)
logstash-input-generator (3.0.5)
logstash-input-graphite (3.0.4)
logstash-input-heartbeat (3.0.5)
logstash-input-http (3.0.8)
logstash-input-http_poller (4.0.4)
logstash-input-imap (3.0.5)
logstash-input-jdbc (4.3.3)
logstash-input-kafka (8.0.4)
logstash-input-pipe (3.0.6)
logstash-input-rabbitmq (6.0.2)
logstash-input-redis (3.1.6)
logstash-input-s3 (3.2.0)
logstash-input-snmptrap (3.0.5)
logstash-input-sqs (3.0.6)
logstash-input-stdin (3.2.5)
logstash-input-syslog (3.2.4)
logstash-input-tcp (5.0.3)
logstash-input-twitter (3.0.7)
logstash-input-udp (3.2.1)
logstash-input-unix (3.0.6)
logstash-output-cloudwatch (3.0.7)
logstash-output-csv (3.0.6)
logstash-output-elasticsearch (9.0.2)
logstash-output-email (4.1.0)
logstash-output-file (4.2.1)
logstash-output-graphite (3.1.4)
logstash-output-http (5.2.0)
logstash-output-kafka (7.0.8)
logstash-output-lumberjack (3.1.5)
logstash-output-nagios (3.0.5)
logstash-output-null (3.0.4)
logstash-output-pagerduty (3.0.6)
logstash-output-pipe (3.0.5)
logstash-output-rabbitmq (5.1.0)
logstash-output-redis (4.0.3)
logstash-output-s3 (4.0.13)
logstash-output-sns (4.0.6)
logstash-output-sqs (5.0.2)
logstash-output-stdout (3.1.3)
logstash-output-tcp (5.0.2)
logstash-output-udp (3.0.5)
logstash-output-webhdfs (3.0.5)
logstash-patterns-core (4.1.2)
PS C:\logstash\bin>

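So, xml.rb:171 in logstash-filter-xml 4.0.5 is the last line in the Logstash code before we eventually reach nokogiri.internals.SaveContextVisitor.encodeStringToHtmlEntity (SaveContextVisitor.java:758), which makes the fatal call to Charset.forName() that triggers the exception. The ASCII-8BIT charset name that it seems to dislike is taken from the Ruby-detected (?) charset of the input XML string.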

My guess is that the input string isn't actually UTF-8 but some other charset that can't be detected, hence it falls back to the (bogus) ASCII-8BIT charset.

So, check your input file. Does it have any 8-bit characters? If so, is it really UTF-8?
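
One way to run that check is sketched below in plain Ruby, outside Logstash. Ruby tags raw bytes as ASCII-8BIT (also known as BINARY), so the sketch starts from binary strings and asks two questions: are there any bytes above 0x7F, and do the bytes decode as valid UTF-8? The helper names and the sample strings are made up for illustration.

```ruby
# Hypothetical helpers for inspecting raw file contents.

# True if any byte falls outside the 7-bit ASCII range.
def eight_bit_bytes?(bytes)
  bytes.each_byte.any? { |b| b > 0x7F }
end

# True if the bytes form a valid UTF-8 sequence.
def valid_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Made-up samples; .b yields a binary (ASCII-8BIT) copy of each string.
samples = {
  "plain ASCII"     => "<org_name>example</org_name>".b,
  "UTF-8 e-acute"   => "<org_name>caf\xC3\xA9</org_name>".b,
  "Latin-1 e-acute" => "<org_name>caf\xE9</org_name>".b,
}

samples.each do |label, bytes|
  puts "#{label}: 8-bit=#{eight_bit_bytes?(bytes)} valid_utf8=#{valid_utf8?(bytes)}"
end
```

For a real report, the bytes could be read with `File.binread(path)` and passed to the same helpers. A file that has 8-bit bytes but fails the UTF-8 check is in some other encoding.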

So it's not the pipeline, it's the file that's being parsed?

I'm 100% certain it's not the file, unless something was changed between 6.1.1 and 6.2.1 in regards to how the XML filter works. I'm using the same ~200 xml files on both systems with the same pipeline. They work on 6.1.1 and not on 6.2.1.

I also tried a few different methods of changing the file encoding to UTF-8, with the same results in the Logstash pipeline:

Notepad++, ensuring Encoding is set to UTF8
PowerShell Set-Content -Encoding UTF8
PowerShell Out-File -Encoding UTF8

The xml filter is unchanged between 6.1.1 and 6.2.1 but the Nokogiri XML library was bumped from 1.8.1 to 1.8.2, which contains a fix for https://github.com/sparklemotion/nokogiri/issues/1659 and that issue is clearly related to what you're seeing. However, it actually looks like that bugfix should've fixed the problem you're having, i.e. from the look of things it should've been 6.1.1 that was broken and 6.2.1 that was working. I clearly don't understand this so I won't dig any further.

Thanks for that digging @magnusbaeck, seriously amazing stuff!

@wwalker would you be up to going back to the upstream library and raising this with them?

Haha, I definitely would be willing...if I understood anything you said. Is the upstream library Nokogiri and I'd submit the issue with them on github?

Here's a fun twist, I found out how to make it work and what the difference was.

In my 6.1.1 implementation, I installed logstash as a service utilizing Non-Sucking Service Manager (NSSM) and then used -f pathtopipeline. If I remove the -f argument and try to load the config from pipelines.yml, it fails with the same error as above.

In 6.2.1, if I specify -f pathtopipeline, I still get the same error.