Hi everyone!
I hope you're doing okay. I'm making this post because I couldn't find anything online about the following issue, and I want to make sure I'm not overlooking something simple.
I have the stack installed on a series of VMs on AWS (one VM for each installation), and I have an input pipeline that uses the Elasticsearch filter to enrich some logs with data from the cluster itself.
The problem is that when Logstash first starts, this pipeline gives the following error:
[2021-10-27T07:35:01,103][ERROR][logstash.javapipeline ][<pipeline name>] Pipeline error {:pipeline_id=>"<pipeline name>", :exception=>#<Manticore::SocketException: Connection refused: connect>, :backtrace=>["fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/manticore-0.7.0-java/lib/manticore/response.rb:37:in `block in initialize'", "fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/manticore-0.7.0-java/lib/manticore/response.rb:79:in `call'", "fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/manticore-0.7.0-java/lib/manticore/response.rb:274:in `call_once'", "fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/manticore-0.7.0-java/lib/manticore/response.rb:158:in `code'", "fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/transport/http/manticore.rb:84:in `block in perform_request'", "fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/transport/base.rb:262:in `perform_request'", "fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/transport/http/manticore.rb:67:in `perform_request'", "fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/client.rb:131:in `perform_request'", "fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/elasticsearch-api-5.0.5/lib/elasticsearch/api/actions/ping.rb:20:in `ping'", "fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-elasticsearch-3.9.3/lib/logstash/filters/elasticsearch.rb:310:in `test_connection!'", "fullpath/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-elasticsearch-3.9.3/lib/logstash/filters/elasticsearch.rb:117:in `register'", "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:75:in `register'", "fullpath/logstash/logstash-core/lib/logstash/java_pipeline.rb:228:in `block in register_plugins'", "org/jruby/RubyArray.java:1809:in `each'", 
"fullpath/logstash/logstash-core/lib/logstash/java_pipeline.rb:227:in `register_plugins'", "fullpath/logstash/logstash-core/lib/logstash/java_pipeline.rb:586:in `maybe_setup_out_plugins'", "fullpath/logstash/logstash-core/lib/logstash/java_pipeline.rb:240:in `start_workers'", "fullpath/logstash/logstash-core/lib/logstash/java_pipeline.rb:185:in `run'", "fullpath/logstash/logstash-core/lib/logstash/java_pipeline.rb:137:in `block in start'"], "pipeline.sources"=>["fullpath/pipelines/certa-estadisticas_logstash.conf"], :thread=>"#<Thread:0x174f411d run>"}
[2021-10-27T07:35:01,447][INFO ][logstash.javapipeline ][<pipeline name>] Pipeline terminated {"pipeline.id"=>"<pipeline name>"}
As far as I have been able to debug, this happens because the ES nodes haven't started yet, and the Elasticsearch filter seems to try to "resurrect" the connection only 2 or 3 times, leaving the pipeline stopped afterwards.
The rest of the pipelines on the installation have the schedule setting, so Logstash keeps running, but without this particular pipeline.
Is there any way to configure that specific pipeline to keep retrying for longer (5 to 10 minutes)?
After all, the outputs retry the connection forever by default, so I don't understand why this filter is able to halt the pipeline entirely until Logstash is restarted with the cluster already running.
If it helps at all, the cluster has 3 nodes, all running on separate VMs, and they start up at the same time as Logstash when the VMs are turned on.
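
To make the behaviour I'm after concrete, here is a minimal sketch of the kind of retry loop I mean, written as a standalone Python helper that could run before Logstash is launched. This is only an illustration under assumptions, not something the filter plugin provides: the function name `wait_for_tcp` is made up, and a successful TCP connect only shows that the port is accepting connections, not that the cluster is ready to answer the ping the filter sends when it registers.

```python
import socket
import time


def wait_for_tcp(host, port, timeout_s=600.0, interval_s=5.0):
    """Hypothetical helper: poll host:port until a TCP connect succeeds.

    Returns True as soon as a connection is accepted, or False once
    roughly timeout_s seconds have passed without a successful connect.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError (e.g. ConnectionRefusedError)
            # while nothing is listening on the port yet.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            pass
        # Give up if another sleep would take us past the deadline.
        if time.monotonic() + interval_s > deadline:
            return False
        time.sleep(interval_s)
```

For example, `wait_for_tcp("localhost", 9200)` would keep trying for up to 10 minutes (9200 being the default Elasticsearch HTTP port; the host name here is a placeholder for one of the ES nodes), and Logstash would only be started once it returns True.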