Beats input: unhandled exception


(axil) #1

Hello
I just migrated from logstash-forwarder to Filebeat, and I end up with the following error message:

{:timestamp=>"2016-01-13T13:46:10.478000+0100", :message=>"Beats input: unhandled exception", :exception=>#<RuntimeError: unsupported protocol 72>, :backtrace=>["/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-beats-2.0.3/lib/lumberjack/beats/server.rb:225:in `handle_version'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-beats-2.0.3/lib/lumberjack/beats/server.rb:210:in `header'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-beats-2.0.3/lib/lumberjack/beats/server.rb:163:in `feed'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-beats-2.0.3/lib/lumberjack/beats/server.rb:342:in `read_socket'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-beats-2.0.3/lib/lumberjack/beats/server.rb:319:in `run'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-beats-2.0.3/lib/logstash/inputs/beats.rb:184:in `invoke'", "org/jruby/RubyProc.java:281:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/concurrent-ruby-0.9.2-java/lib/concurrent/executor/executor_service.rb:515:in `run'", "Concurrent$$JavaExecutorService$$Job_1598968662.gen:13:in `run'"], :level=>:error}

What's the problem?
Thank you.


(Magnus Bäck) #2

Are you using Filebeat or logstash-forwarder to send data to Logstash's beats input?


(axil) #3

I use Filebeat to send logs to Logstash.


(Magnus Bäck) #4

Have you enabled SSL on the Logstash side but not on the Filebeat side or vice versa? Please show your Filebeat and Logstash configurations.
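For reference, a matching pair might look like the following (a sketch only; certificate paths are illustrative, not taken from this thread). On the Filebeat side the `tls` section with `certificate_authorities` would be uncommented, and on the Logstash side the beats input would enable SSL:

```
input {
  beats {
    port => 5044
    ssl  => true
    # Illustrative paths; use the certificate/key that match the CA
    # configured in filebeat.yml under tls.certificate_authorities.
    ssl_certificate => "/etc/pki/tls/certs/logstash.crt"
    ssl_key         => "/etc/pki/tls/private/logstash.key"
  }
}
```

If only one side has TLS enabled, the handshake bytes get misread by the other side, which can surface as protocol-level errors like the one above.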


(axil) #5

Filebeat configuration:

#logstash:
  # The Logstash hosts
  hosts: ["10.0.0.45:5044","10.0.0.30:5044"]

  # Number of workers per Logstash host.
  worker: 2

  # Optional: load balance the events between the Logstash hosts.
  loadbalance: true

# Optional index name. The default index name depends on the each beat.
# For Packetbeat, the default is set to packetbeat, for Topbeat
# top topbeat and for Filebeat to filebeat.
#index: filebeat

# Optional TLS. By default is off.
#tls:
  # List of root certificates for HTTPS server verifications
  #certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for TLS client authentication
  #certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #certificate_key: "/etc/pki/client/cert.key"

  # Controls whether the client verifies server certificates and host name.
  # If insecure is set to true, all server host names and certificates will be
  # accepted. In this mode TLS based connections are susceptible to
  # man-in-the-middle attacks. Use only for testing.
  #insecure: true

  # Configure cipher suites to be used for TLS connections
  #cipher_suites: []

  # Configure curve types for ECDHE based cipher suites
  #curve_types: []

File as output

Logstash configuration:

# Logstash inputs
input {
  file {
    type => "syslog"
    path => [ "/var/log/messages", "/var/log/*.log" ]
    tags => [ "syslog" ]
  }
  tcp {
    port => 514
    type => "syslog"
    tags => [ "syslog" ]
  }
  udp {
    port => 514
    type => "syslog"
    tags => [ "syslog" ]
  }
  tcp {
    port => 12530
    type => "syslog"
    tags => [ "netcat" ]
  }
  udp {
    port => 12530
    type => "syslog"
    tags => [ "netcat" ]
  }
  snmptrap {
    port => 162
    type => "snmptrap"
    host => "0.0.0.0"
    yamlmibdir => "/opt/logstash/vendor/bundle/jruby/1.9/gems/snmp-1.2.0/data/ruby/snmp/mibs/"
    tags => [ "snmptrap" ]
  }
  redis {
    key => "phplogs"
    data_type => "list"
  }
  beats {
    port => 5044
    tags => [ "filebeat" ]
  }
}


(axil) #6

Now I keep getting the following error message:
{:timestamp=>"2016-01-13T14:40:31.047000+0100", :message=>"Beats input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}


(Magnus Bäck) #7

Your Logstash pipeline is clogged, probably because the outputs are failing. What kind of outputs do you have?


(axil) #8

I added the following line to the Logstash configuration file:
stdout { codec => rubydebug }
but I get a lot of output.

for example
{
  "@timestamp" => "2016-01-13T14:47:30.000Z",
  "tags" => [
    [0] "nagios",
    [1] "_grokparsefailure"
  ],
  "nagios_epoch" => "1452696450",
  "nagios_hostname" => "RPJ00201",
  "nagios_service" => "ORA-ERR:ORA16001",
  "nagios_state" => "2",
  "nagios_type" => "EXTERNAL COMMAND",
  "nagios_check_result" => "ORA-12012: error on auto execute of job "V5CADM"."JOB_APPLY_GRANT_CARTO"",
  "nagios_command" => "PROCESS_SERVICE_CHECK_RESULT",
  "hour" => "2016-01-13T14.000Z",
  "syslog_severity_code" => 5,
  "syslog_facility_code" => 1,
  "syslog_facility" => "user-level",
  "syslog_severity" => "notice",
  "syslog_source-IP" => "REY00060.spia.log.intra.laposte.fr",
  "message-raw" => "[1452696450] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;RPJ00201;ORA-ERR:ORA16001;2;ORA-12012: error on auto execute of job "V5CADM"."JOB_APPLY_GRANT_CARTO""
}
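To keep the rubydebug output manageable, the stdout output can be limited to the events under investigation, e.g. only those tagged by the beats input (a sketch, assuming the "filebeat" tag set in the beats input above):

```
output {
  if "filebeat" in [tags] {
    stdout { codec => rubydebug }
  }
}
```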
How can I find which line is the problem?
I recently added the following lines:
filter {
  if "filebeat" in [tags] {
    # hraccess log
  }
  grok {
    match => ["message", "%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME} %{WORD:level} Authentication succeeded - Login ID <%{USERNAME:user}> - Hint - Remote IP address <%{IP:client}/%{IP:client2}"]
    match => ["message", "%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME} %{WORD:level} Authentication request received - Login ID <%{USERNAME:useruknown}>"]
    #match => ["message", "%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME} %{WORD:level} %{GREEDYDATA:codeerror}%{IP:client}%{DATA:message}"]
    #match => ["message", "%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME} %{WORD:level} %{GREEDYDATA:message}"]
  }
}
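Note that as written, the `if "filebeat" in [tags]` block closes immediately after the comment, so the grok filter runs on every event, including the syslog/nagios traffic, which would explain the `_grokparsefailure` tags seen above. If the patterns were meant for Filebeat events only, a sketch of the likely intent (reusing the same patterns, elided here for brevity):

```
filter {
  if "filebeat" in [tags] {
    # hraccess log
    grok {
      # the same hraccess match patterns as above
    }
  }
}
```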

(axil) #9

What does this mean?
{:timestamp=>"2016-01-13T16:27:04.380000+0100", :message=>"Beats input: The circuit breaker has detected a slowdown or stall in the pipeline, the input is closing the current connection and rejecting new connection until the pipeline recover.", :exception=>LogStash::CircuitBreaker::HalfOpenBreaker, :level=>:warn}


(Magnus Bäck) #10

Is the stdout output really the only output?


(axil) #11

It is one of the outputs; I have many more, it was just an example.
After a few minutes of operation I get the following messages:
{:timestamp=>"2016-01-13T16:35:53.392000+0100", :message=>"CircuitBreaker::rescuing exceptions", :name=>"Beats input", :exception=>LogStash::SizedQueueTimeout::TimeoutError, :level=>:warn}
{:timestamp=>"2016-01-13T16:35:53.394000+0100", :message=>"Beats input: The circuit breaker has detected a slowdown or stall in the pipeline, the input is closing the current connection and rejecting new connection until the pipeline recover.", :exception=>LogStash::CircuitBreaker::HalfOpenBreaker, :level=>:warn}


(Magnus Bäck) #12

Well, as I said, the error message indicates that at least one output is having trouble, which slows down the whole Logstash pipeline. With the information at hand I have nothing more to add.


(axil) #13

Can you tell me how to find the part that is slowing down the data processing?


(Steffen Siering) #14

Have you checked one of your output target logs?

Given that a single slow output can stall all the others, I'm not sure you can easily debug this from Logstash. Maybe people in the Logstash sub-forum have a better solution.

You basically have to single out the connections slowing down logstash.

Some debugging steps in Logstash:

  1. Remove (comment out) all outputs and add a null output. If the problem persists, the filters are too slow (consider adding more filter workers).
  2. Remove the null output and enable the first output. If the problem persists, remove that output and try another one.
  3. Add outputs one after another until Logstash gets too slow. If Logstash gets too slow, disable the last one added and try another one.
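Step 1 above might look like the following (a sketch; the null output plugin ships with Logstash and simply discards events):

```
output {
  null {}
}
```

Filter workers can then be raised with the `-w` flag when starting Logstash, e.g. `bin/logstash -w 4 -f <your config>` (the worker count here is illustrative; tune it to your CPU count).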

Always check resource usage (CPU/memory) when adding more outputs. Maybe one output is too slow, or you've got too many outputs slowing down Logstash in general.

If you've got too many outputs, try disabling/enabling them in groups first.

Once you've singled out one or two outputs, check the target (e.g. a MySQL server) for possible optimizations.

Maybe you've also got too many inputs and are trying to process too many events overall in one Logstash instance. Or maybe your machine is not properly sized for your workload.
