NullPointerException in CloudWatch input plugin

I have a Logstash instance running in AWS with a pipeline that pulls metrics from CloudWatch. It uses the standard cloudwatch input (https://github.com/logstash-plugins/logstash-input-cloudwatch).

When I start up the Logstash instance, everything reads fine. However, after some time I notice that I am no longer receiving data from this pipeline. Checking the logs gives me this output:

[2018-05-09T12:52:32,205][INFO ][logstash.inputs.cloudwatch] Polling CloudWatch API
[2018-05-09T12:52:32,208][INFO ][logstash.inputs.cloudwatch] Polling metric DiskReadOps
[2018-05-09T12:52:32,209][INFO ][logstash.inputs.cloudwatch] Filters: [{:name=>"tag:polaris_environment", :values=>["dev"]}]
[2018-05-09T12:52:32,217][ERROR][logstash.pipeline        ] A plugin had an unrecoverable error. Will restart this plugin.
  Pipeline_id:cloudwatch-metrics-pipeline
  Plugin: <LogStash::Inputs::CloudWatch namespace=>"AWS/EC2", metrics=>["CPUUtilization", "DiskReadOps", "DiskWriteOps", "DiskReadBytes", "DiskWriteBytes", "NetworkIn", "NetworkOut", "NetworkPacketsIn", "NetworkPacketsOut"], filters=>{"tag:polaris_environment"=>"dev"}, region=>"us-east-2", type=>"cloudwatch-dev-ec2", id=>"af607b0567b1dadfe40bc538a69dd1c5ba2dc861213a1db38942a83ef03e9e92", enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_5dd389fd-097e-4972-9824-c2cfe8305959", enable_metric=>true, charset=>"UTF-8">, use_ssl=>true, statistics=>["SampleCount", "Average", "Minimum", "Maximum", "Sum"], interval=>900, period=>300, combined=>false>
  Error: 
  Exception: Java::JavaLang::NullPointerException
  Stack: org.jruby.RubyString.getStringForPattern(RubyString.java:3741)
org.jruby.RubyString.asRegexpArg(RubyString.java:2405)
org.jruby.RubyString.subBangNoIter(RubyString.java:2445)
org.jruby.RubyString.sub_bang(RubyString.java:2398)
org.jruby.RubyString$INVOKER$i$sub_bang.call(RubyString$INVOKER$i$sub_bang.gen)
    usr.share.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.aws_minus_sdk_minus_v1_minus_1_dot_67_dot_0.lib.aws.core.client.RUBY$method$return_or_raise$0(/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/aws-sdk-v1-1.67.0/lib/aws/core/client.rb:373)
usr.share.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.aws_minus_sdk_minus_v1_minus_1_dot_67_dot_0.lib.aws.core.client.RUBY$method$client_request$0(/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/aws-sdk-v1-1.67.0/lib/aws/core/client.rb:476)
$_28_eval_29_.RUBY$method$describe_instances$0((eval):3)
usr.share.logstash.vendor.bundle.jruby.$2_dot_3_dot_0.gems.logstash_minus_input_minus_cloudwatch_minus_2_dot_0_dot_3.lib.logstash.inputs.cloudwatch.RUBY$method$resources$0(/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-input-cloudwatch-2.0.3/lib/logstash/inputs/cloudwatch.rb:302)

Note: I had to truncate the stack trace. I cut out most of the JRuby frames and kept everything down to the line I thought was most relevant. The full trace is on Pastebin.

I looked at the code at lib/logstash/inputs/cloudwatch.rb:302:

instances = clients[@namespace].describe_instances(filters: aws_filters)[:reservation_set].collect do |r|

What I think is happening is that the plugin is trying to query old EC2 instances. Most of our EC2 instances are fronted by Beanstalk, or are torn down and then brought back online with updated code. This can leave them in a terminated state where AWS still lists them but they may no longer be in CloudWatch (and are obviously not reporting metrics).

The code assumes that every level of the response is populated: it chains method calls directly onto the hash and array accessors, so a nil anywhere in the chain raises an error instead of being skipped.
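
To illustrate what I mean, here is a minimal sketch of a nil-tolerant version of that lookup. It is not the plugin's code: FakeClient and safe_instance_ids are hypothetical names, the instance ID is made up, and the :instances_set / :instance_id keys are my assumption about the shape of the aws-sdk-v1 response. Also note that the trace above shows the exception surfacing inside describe_instances itself (aws/core/client.rb), so a guard on the result alone may not be enough to cover this particular failure.

```ruby
# Sketch only: FakeClient stands in for the real EC2 client so this runs
# without AWS credentials. It returns whatever response it was built with.
FakeClient = Struct.new(:response) do
  def describe_instances(filters:)
    response
  end
end

# Hypothetical helper: collects instance IDs, treating any missing level
# of the response as empty instead of calling a method on nil.
# The :instances_set and :instance_id keys are assumed, not confirmed.
def safe_instance_ids(client, aws_filters)
  response = client.describe_instances(filters: aws_filters) || {}
  reservations = response[:reservation_set] || []
  reservations.flat_map do |r|
    (r[:instances_set] || []).map { |i| i[:instance_id] }
  end.compact
end

filters = [{ name: "tag:polaris_environment", values: ["dev"] }]

# One populated reservation and one with a missing instances_set.
partial = FakeClient.new({
  reservation_set: [
    { instances_set: [{ instance_id: "i-123" }] },
    { instances_set: nil }
  ]
})

p safe_instance_ids(partial, filters)
p safe_instance_ids(FakeClient.new({}), filters)
```

With a guard like this, a reservation without instances is simply skipped, and an empty response yields an empty list rather than an exception.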

I think this should be reported via a GitHub issue, but I thought I should bring it up here first.

Thanks!
