I am having an issue where Logstash does not reconnect to RabbitMQ after losing the connection.
I run two Logstash instances on two different VMs, and both consume from the same RabbitMQ queue.
One instance hit an error and stopped ingesting data, while the other was not affected at all. Before that, both instances had been running for around 12 hours.
Both machines share the same configuration, sit in the same domain, and consume from the same queue, yet they reacted differently.
I restarted the affected Logstash service, and everything has worked perfectly since then.
I found this error in the logs:
Error: #method<channel.close>(reply-code=406, reply-text=PRECONDITION_FAILED - unknown delivery tag 665, class-id=60, method-id=80) Exception: MarchHare::ChannelAlreadyClosed Stack: /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/march_hare-4.1.1-java/lib/march_hare/exceptions.rb:121:in `convert_and_reraise'
After the error above, the logs show these retries, repeated every second:
Error while setting up connection for rabbitmq input! Will retry. {:message=>"#method<channel.close>(reply-code=406, reply-text=PRECONDITION_FAILED - unknown delivery tag 665, class-id=60, method-id=80)", :class=>"MarchHare::ChannelAlreadyClosed", :location=>"/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/march_hare-4.1.1-java/lib/march_hare/exceptions.rb:121:in `convert_and_reraise'"}
It looks like Logstash tried to restore the connection every second but never succeeded until I restarted the service.
I also found a similar error in the logs about two hours before the issue above occurred:
Error: MarchHare::ChannelAlreadyClosed Exception: MarchHare::ChannelAlreadyClosed Stack: /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/march_hare-4.1.1-java/lib/march_hare/exceptions.rb:121:in `convert_and_reraise'
After this error I also noticed a few retries like the one below, but after a few attempts the connection was restored.
RabbitMQ connection was closed! {:url=>[redacted], :automatic_recovery=>true, :cause=>com.rabbitmq.client.ShutdownSignalException: connection error}
In a nutshell, Logstash ran into a connection issue with RabbitMQ twice: the first time it recovered, the second time the recovery failed. The only difference I can see between the two errors is 'reply-code=406, reply-text=PRECONDITION_FAILED - unknown delivery tag 665, class-id=60, method-id=80'.
I have two questions:
- What is the difference between these two connection errors? Specifically, what does 'reply-code=406' mean?
- Why did the recovery succeed the first time but fail the second?
The configuration I am using is simple and looks like this:
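To make the first question concrete, here is a minimal Python sketch that pulls apart the fields of the channel.close text from the log line above. The regex, the function name decode_channel_close, and the small lookup table are my own illustration and not part of Logstash or March Hare; the (class-id, method-id) pairs are taken from the AMQP 0-9-1 reference as I understand it, and the table is deliberately incomplete.

```python
import re

# Field layout of the channel.close text as it appears in the Logstash log.
CLOSE_RE = re.compile(
    r"reply-code=(?P<reply_code>\d+), "
    r"reply-text=(?P<reply_text>.*?), "
    r"class-id=(?P<class_id>\d+), "
    r"method-id=(?P<method_id>\d+)"
)

# Small excerpt of the AMQP 0-9-1 (class-id, method-id) table; not exhaustive.
AMQP_METHODS = {
    (60, 20): "basic.consume",
    (60, 80): "basic.ack",
    (60, 90): "basic.reject",
    (60, 120): "basic.nack",
}


def decode_channel_close(message):
    """Return the decoded channel.close fields, or None if the text does not match."""
    match = CLOSE_RE.search(message)
    if match is None:
        return None
    class_id = int(match.group("class_id"))
    method_id = int(match.group("method_id"))
    return {
        "reply_code": int(match.group("reply_code")),
        "reply_text": match.group("reply_text"),
        # The method whose handling made the broker close the channel.
        "method": AMQP_METHODS.get((class_id, method_id), "unknown"),
    }


log_line = (
    "#method<channel.close>(reply-code=406, "
    "reply-text=PRECONDITION_FAILED - unknown delivery tag 665, "
    "class-id=60, method-id=80)"
)
print(decode_channel_close(log_line))
```

If the table is right, class-id=60 with method-id=80 is basic.ack, which would mean the broker rejected an acknowledgement for delivery tag 665 rather than the connection attempt itself.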
input {
  rabbitmq {
    host => [redacted]
    port => [redacted]
    vhost => 'xxx'
    passive => true
    queue => [redacted]
    ssl => true
    user => [redacted]
    password => [redacted]
    prefetch_count => 2
  }
}
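As far as I understand RabbitMQ, delivery tags are scoped to the channel they were delivered on, so acknowledging a tag on a different channel (for example, one opened after a recovery) or acknowledging the same tag twice is rejected with 406 PRECONDITION_FAILED. The toy model below illustrates only that rule. ToyChannel and PreconditionFailed are hypothetical names, no broker is involved, and whether this is what actually happened inside the rabbitmq input is an assumption on my part.

```python
class PreconditionFailed(Exception):
    """Stands in for a channel.close carrying reply-code 406."""


class ToyChannel:
    """Toy model of per-channel delivery tag bookkeeping (not a real client)."""

    def __init__(self):
        self._next_tag = 1      # tags start at 1 and are counted per channel
        self._unacked = set()   # tags delivered on this channel and not yet acked

    def deliver(self):
        """Simulate the broker delivering one message; return its delivery tag."""
        tag = self._next_tag
        self._next_tag += 1
        self._unacked.add(tag)
        return tag

    def ack(self, tag):
        """Acknowledge a tag; unknown tags fail the way the broker's 406 does."""
        if tag not in self._unacked:
            raise PreconditionFailed(
                f"PRECONDITION_FAILED - unknown delivery tag {tag}"
            )
        self._unacked.remove(tag)


old_channel = ToyChannel()
tag = old_channel.deliver()   # message delivered on the original channel

new_channel = ToyChannel()    # e.g. the channel opened after a recovery
try:
    new_channel.ack(tag)      # this tag was never delivered on the new channel
except PreconditionFailed as err:
    print(err)
```

If something like this is going on, a message received before the connection dropped would be acknowledged on the recovered channel, the broker would close that channel with 406, and the cycle could repeat on every retry.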