Pipeline ordering when using the new pipeline output


(Dheeraj Gupta) #1

Hi,

I have already posted about the startup/shutdown order when using multiple pipelines. Up to v6.2.4, Logstash appeared to start and shut down pipelines in alphabetical order.
I recently upgraded to 6.3.0, and startup/shutdown no longer seems to be ordered (or perhaps it is done concurrently using threads). So I decided to switch to the new (beta) "pipeline" output to join two pipelines together. My understanding was that Logstash would enforce an ordering in which the upstream pipeline shuts down first and starts up last, so that no data loss occurs.

My pipeline setup is

0_main (Input and processing) ---> 1_hadoop (Dedicated pipeline to send events to webhdfs)

---> indicates a pipeline input/output glue (0_main has a pipeline output to the virtual address "foo", and 1_hadoop defines a pipeline input at the virtual address "foo").
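For reference, a setup like this can be sketched roughly as follows (the file paths and the webhdfs settings here are illustrative placeholders, not my actual config):

```yaml
# pipelines.yml -- two pipelines glued via the pipeline input/output
- pipeline.id: 0_main
  path.config: "/etc/logstash/conf.d/0_main.conf"
- pipeline.id: 1_hadoop
  path.config: "/etc/logstash/conf.d/1_hadoop.conf"
```

```
# 0_main.conf -- input and processing, forwards to the virtual address "foo"
output {
  pipeline { send_to => ["foo"] }
}
```

```
# 1_hadoop.conf -- dedicated pipeline, listens on "foo" and writes to webhdfs
input {
  pipeline { address => "foo" }
}
output {
  webhdfs {
    host => "namenode.example.com"   # placeholder
    path => "/logs/%{+YYYY-MM-dd}.log"
    user => "hdfs"
  }
}
```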

When I started this setup, 1_hadoop was started first, but during shutdown it was also shut down first, which caused 0_main to stall at shutdown with errors like

[2018-06-29T16:29:51,085][WARN ][org.logstash.plugins.pipeline.PipelineBus] Attempted to send event to 'foo' but that address was unavailable. Maybe the destination pipeline is down or stopping? Will Retry.

Eventually the timeouts kicked in, the data was purged, and the shutdown completed.

I have two questions:

  • Is there a way to enforce startup/shutdown ordering for pipelines so that shutdown stalls can be prevented?
  • When using the "pipeline" plugin as glue, are there any plans to enforce an implicit ordering (at least for the glued pipelines) so that the downstream pipeline starts first and terminates last (which, to my mind, is the logical way connected pipelines should behave)?

(Dheeraj Gupta) #2

Bump!!


(Dheeraj Gupta) #3

I have also created a GitHub issue for this topic.

Update: It looks like 6.3.1 has addressed some of these issues. The startup ordering is now correct when using the new pipeline plugin as glue:

[2018-07-13T15:13:38,736][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.3.1"}
[2018-07-13T15:13:40,048][INFO ][org.logstash.ackedqueue.QueueUpgrade] No PQ version file found, upgrading to PQ v2.
[2018-07-13T15:13:40,416][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"1_hadoop", "pipeline.workers"=>2, "pipeline.batch.size"=>2000, "pipeline.batch.delay"=>50}
[2018-07-13T15:13:41,126][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"1_hadoop", :thread=>"#<Thread:0x12384dd6@/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:245 sleep>"}
[2018-07-13T15:13:51,339][INFO ][org.logstash.ackedqueue.QueueUpgrade] No PQ version file found, upgrading to PQ v2.
[2018-07-13T15:13:51,351][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"0_main", "pipeline.workers"=>5, "pipeline.batch.size"=>2000, "pipeline.batch.delay"=>50}

But during shutdown, all pipelines are still stopped together (or so it seems), leading to a stall:

[2018-07-13T15:15:37,155][WARN ][logstash.runner          ] SIGTERM received. Shutting down.
[2018-07-13T15:15:38,018][WARN ][org.logstash.plugins.pipeline.PipelineBus] Attempted to send event to 'hadoop_pipeline' but that address was unavailable. Maybe the destination pipeline is down or stopping? Will Retry.
[2018-07-13T15:15:38,179][WARN ][org.logstash.plugins.pipeline.PipelineBus] Attempted to send event to 'hadoop_pipeline' but that address was unavailable. Maybe the destination pipeline is down or stopping? Will Retry.
[2018-07-13T15:15:38,625][INFO ][logstash.pipeline        ] Pipeline has terminated {:pipeline_id=>"1_hadoop", :thread=>"#<Thread:0x12384dd6@/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:245 run>"}
[2018-07-13T15:15:39,019][WARN ][org.logstash.plugins.pipeline.PipelineBus] Attempted to send event to 'hadoop_pipeline' but that address was unavailable. Maybe the destination pipeline is down or stopping? Will Retry.
[2018-07-13T15:15:39,164][WARN ][org.logstash.plugins.pipeline.PipelineBus] Attempted to send event to 'hadoop_pipeline' but that address was unavailable. Maybe the destination pipeline is down or stopping? Will Retry.

(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.