I have deployed a Logstash pod on our Kubernetes cluster that reads from a RabbitMQ deployment. Recently, lag on RabbitMQ suddenly started to increase: the number of ready messages grew linearly and never stopped.
RabbitMQ sends a basic.cancel to the consumer (the log block below), and the consumer stops operating. I see that the client has a handleCancel callback that should handle this, but I'm not sure why it does not cause Logstash to exit so that the pod restarts. The pods have to be restarted manually to recover. Ideally, the pods should restart themselves whenever the consumer shuts down.
[2022-04-10T00:04:28,144][INFO ][logstash.inputs.rabbitmq ][logstash-pipeline-industrialcontrols-actions-0][f707c7aa9671e20347cbe2c66dea710d5f5e8c7737d224f2f749d0b372c12764] Received basic.cancel from , shutting down.
E, [2022-04-10T00:04:28.188523 #1] ERROR -- #<MarchHare::Session:2064 admin@localhost:5672, vhost=/>: Consumer org.jruby.proxy.com.rabbitmq.client.DefaultConsumer$Proxy6@313d35e0 (amq.ctag-jzMFgubjEY9hvEwU7KviaQ) method handleCancel for channel AMQChannel(amqp://admin@172.17.220.150:5672/,1)threw an exception for channel AMQChannel(amqp://admin@172.17.220.150:5672/,1)
E, [2022-04-10T00:04:28.212289 #1] ERROR -- #<MarchHare::Session:2064 admin@localhost:5672, vhost=/>: Unknown consumerTag (Java::JavaIo::IOException)
com.rabbitmq.client.impl.ChannelN.basicCancel(com/rabbitmq/client/impl/ChannelN.java:1476)
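For reference, the input side of the pipeline is configured roughly like this. This is a minimal sketch: the host, queue name, and credentials below are placeholders (the queue name is guessed from the pipeline name in the log), not the real values.

```
input {
  rabbitmq {
    host           => "rabbitmq.default.svc"          # placeholder service name
    port           => 5672
    vhost          => "/"
    user           => "admin"
    password       => "${RABBITMQ_PASSWORD}"
    queue          => "industrialcontrols-actions"    # placeholder queue name
    durable        => true
    prefetch_count => 256
  }
}
```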