Background:
We are using the Logstash S3 input plugin to ingest logs from an S3 bucket in AWS. In the Logstash plain logs we see constant plugin errors that cause the plugin to restart, and the errors appear to be caused by failed TCP connections.
Each time the plugin restarts, it seems to iterate over the objects in the S3 bucket again before it processes any logs, which appears to add lag to ingestion.
The Logstash config itself works, and logs are being ingested.
The same error occurs for different S3 buckets, on average about 43 times per day.
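For reference, a minimal sketch of how the per-day figure could be tallied from the Logstash plain log, and broken down by hour to see whether the restarts cluster at particular times. It assumes the timestamp prefix and message text shown in the error sample in this post; the function name `count_restarts` and the log path in the usage comment are illustrative assumptions, not part of Logstash.

```python
import re
from collections import Counter

# Timestamp prefix of an ERROR line, e.g. [2023-01-06T00:10:34,014][ERROR]
# (format assumed from the error sample in this post).
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2})T(\d{2}):\d{2}:\d{2},\d{3}\]\[ERROR\]")
# Message text that marks a plugin restart in the sample.
MARKER = "A plugin had an unrecoverable error"


def count_restarts(lines):
    """Count plugin-restart errors per day and per hour.

    `lines` is any iterable of log lines. Returns two Counters:
    one keyed by 'YYYY-MM-DD', one keyed by 'YYYY-MM-DD HH:00'.
    """
    per_day = Counter()
    per_hour = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        match = LINE_RE.match(line)
        if match:
            day, hour = match.groups()
            per_day[day] += 1
            per_hour[f"{day} {hour}:00"] += 1
    return per_day, per_hour


# Usage (the path is an assumption; adjust to where your plain log lives):
# with open("/var/log/logstash/logstash-plain.log", encoding="utf-8") as fh:
#     per_day, per_hour = count_restarts(fh)
# for day, n in sorted(per_day.items()):
#     print(day, n)
```

If the hourly counts are roughly even, the failures are probably not tied to a scheduled job or a traffic peak; if they cluster, the timing may point at something else running on the host or network at that moment.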
Details
We tried the following:
- The error message reads "Error: Failed to open TCP connection to bucketA.s3.eu-central-1.amazonaws.com:443 (initialize: name or service not known)", so we examined a tcpdump capture. DNS resolution succeeded and returned a valid IP.
- Increased the JVM heap size from 1 GB to 4 GB.
- Reduced the number of folders to be ingested. This shortened the time taken to iterate over the objects in S3 and so reduced the lag, but the plugin restart error still occurs.
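Since the error text points at name resolution, a capture that shows one successful lookup does not rule out occasional failed ones. Below is a minimal sketch of a probe that resolves the endpoint repeatedly on the Logstash host and records only the failures. The function name `probe_dns` and its parameters are illustrative assumptions. It exercises the operating system resolver through Python; Logstash runs on JRuby inside the JVM, so its lookups may take a different path, and a clean run here would not fully clear DNS.

```python
import socket
import time
from datetime import datetime, timezone


def probe_dns(host, port=443, attempts=60, delay=1.0,
              resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve `host` repeatedly and return the failed attempts.

    Returns a list of (UTC timestamp, error text) tuples, one per
    attempt that raised socket.gaierror. `resolver` and `sleep` are
    injectable so the loop can be exercised without network access.
    """
    failures = []
    for i in range(attempts):
        try:
            resolver(host, port, proto=socket.IPPROTO_TCP)
        except socket.gaierror as exc:
            stamp = datetime.now(timezone.utc).isoformat()
            failures.append((stamp, str(exc)))
        # Pause between attempts, but not after the last one.
        if i < attempts - 1:
            sleep(delay)
    return failures


# Usage (hostname is the masked one from the error message; substitute
# the real bucket endpoint before running):
# for stamp, err in probe_dns("bucketA.s3.eu-central-1.amazonaws.com"):
#     print(stamp, err)
```

Running it for a longer window (for example, a higher `attempts` value over an hour) and comparing the failure timestamps with the restart timestamps in the Logstash log would show whether the two line up.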
Error Message Sample
Some of the info is masked, such as the pipeline ID and bucket name.
[2023-01-06T00:10:34,014][ERROR][logstash.javapipeline ][pipeline A][pipeline ID A] A plugin had an unrecoverable error. Will restart this plugin.
Pipeline_id: pipeline A
Plugin: <LogStash::Inputs::S3 bucket=>"bucket A", include_object_properties=>true, prefix=>"Prefix A", id=>"pipeline ID A", region=>"eu-central-1", type=>"A-log", sincedb_path=>"/var/lib/logstash/plugins/inputs/s3/sincedb_A", enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_A", enable_metric=>true, charset=>"UTF-8">, role_session_name=>"logstash", delete=>false, interval=>60, watch_for_new_files=>true, temporary_directory=>"/tmp/logstash", gzip_pattern=>".gz(ip)?$">
Error: Failed to open TCP connection to bucketA.s3.eu-central-1.amazonaws.com:443 (initialize: name or service not known)
Exception: Seahorse::Client::NetworkingError
Stack: /usr/share/logstash/vendor/jruby/lib/ruby/stdlib/net/http.rb:943:in `block in connect'
org/jruby/ext/timeout/Timeout.java:114:in `timeout'
org/jruby/ext/timeout/Timeout.java:90:in `timeout'
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/net/http.rb:939:in `connect'
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/net/http.rb:924:in `do_start'
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/net/http.rb:919:in `start'
/usr/share/logstash/vendor/jruby/lib/ruby/stdlib/delegate.rb:83:in `method_missing'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/seahorse/client/net_http/connection_pool.rb:285:in `start_session'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/seahorse/client/net_http/connection_pool.rb:92:in `session_for'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/seahorse/client/net_http/handler.rb:119:in `session'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/seahorse/client/net_http/handler.rb:71:in `transmit'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/seahorse/client/net_http/handler.rb:45:in `call'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/seahorse/client/plugins/content_length.rb:12:in `call'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/s3_request_signer.rb:88:in `call'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/s3_request_signer.rb:23:in `call'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/s3_host_id.rb:14:in `call'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/xml/error_handler.rb:8:in `call'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/helpful_socket_errors.rb:10:in `call'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/s3_request_signer.rb:65:in `call'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/s3_redirects.rb:15:in `call'
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/retry_errors.rb:108:in `call'
Has anyone run into the same issue, or does anyone know how to fix it?