Logstash in Kubernetes Not Using ServiceAccount with IAM Role

I have a Logstash input to pull logs from S3.

I attached a serviceaccount with an IAM role that grants S3 access to the Logstash stateful set.

I checked that the env variables on the container include AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE. I even attached the serviceaccount to a test Ubuntu container with the AWS CLI and was able to run aws s3 ls against the bucket.
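For reference, this is the check I ran from inside the container, sketched in Ruby. AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE are the standard variables injected by the EKS pod identity webhook; irsa_status is a helper written for this example, not part of any library:

```ruby
# Quick check that the IRSA wiring is visible to the process.
# AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE are injected by the
# EKS pod identity webhook when the serviceaccount is annotated.
def irsa_status(env)
  role_arn   = env["AWS_ROLE_ARN"]
  token_file = env["AWS_WEB_IDENTITY_TOKEN_FILE"]
  # Both variables must be set and the token file must actually exist.
  role_arn && token_file && File.file?(token_file) ? "present" : "missing"
end

puts "IRSA wiring: #{irsa_status(ENV)}"
```

In my pod this reports "present", so the webhook side is working; the problem is on the Logstash side.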

Logs on the IAM side show that nothing has assumed the IAM role except for the time I tested it with the test Ubuntu container. The Logstash container does not seem to use it at all, despite being attached to the serviceaccount.

Can someone please advise? Has anyone run into this?

When Logstash starts up, it gives the following error:
[2021-10-06T22:32:14,762][ERROR][logstash.inputs.s3 ][main][3ba0d3d1945d30b251f9e7d0f133b1df28903dfe2099bbaba5a6270acfbe77ff] Unable to list objects in bucket {:exception=>Aws::S3::Errors::AccessDenied, :message=>"Access Denied", :backtrace=>["/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/seahorse/client/plugins/raise_response_errors.rb:15:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/s3_sse_cpk.rb:19:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/s3_dualstack.rb:24:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/s3_accelerate.rb:34:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:20:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/idempotency_token.rb:18:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/param_converter.rb:20:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/aws-sdk-core/plugins/response_paging.rb:26:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/seahorse/client/plugins/response_target.rb:21:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/seahorse/client/request.rb:70:in `send_request'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-core-2.11.632/lib/seahorse/client/base.rb:207:in `block in define_operation_methods'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-resources-2.11.632/lib/aws-sdk-resources/request.rb:24:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-resources-2.11.632/lib/aws-sdk-resources/operations.rb:139:in `all_batches'", 
"org/jruby/RubyEnumerator.java:396:in `each'", "org/jruby/RubyEnumerator.java:414:in `each'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/aws-sdk-resources-2.11.632/lib/aws-sdk-resources/collection.rb:18:in `each'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:132:in `list_new_files'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:172:in `process_files'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:123:in `block in run'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/stud-0.0.23/lib/stud/interval.rb:20:in `interval'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-s3-3.5.0/lib/logstash/inputs/s3.rb:122:in `run'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:405:in `inputworker'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:396:in `block in start_input'"], :prefix=>"AWSLogs/REDACTED/elasticloadbalancing/us-west-2/"}

I am using the OSS version of Logstash, 7.10.2.

I install it with the official Logstash Helm chart:

helm upgrade -i logstash . -f ./values-NEW.yaml -n elk --create-namespace

Here is my values.yaml:

image: "docker.elastic.co/logstash/logstash-oss"

replicas: 1

# Allows you to add any pipeline files in /usr/share/logstash/pipeline/
### ***warn*** there is a hardcoded logstash.conf in the image, override it first
logstashPipeline:
  logstash.conf: |
    input {
      beats {
        port => 5044
        type => "logs"
      }
      s3 {
        bucket => "acme-production-k8s-alb-logs-291704e9871ffaae5bac"
        prefix => "AWSLogs/REDACTED/elasticloadbalancing/us-west-2/"
        region => "us-west-2"
        additional_settings => {
          force_path_style => true
          follow_redirects => false
        }
        tags => [ "alb" ]
        add_field => {
          env => "production"
          aws_region => "us-west-2"
        }
      }
    }
    filter {
      if [fields][document_type] == "k8s_app" {
        date {
          match => ["time", "ISO8601"]
          remove_field => ["time"]
        }
        grok {
          match => { "[log][file][path]" => "/var/log/containers/%{DATA:k8s_pod}_%{DATA:k8s_namespace}_%{GREEDYDATA:k8s_service}-%{DATA:k8s_container_id}.log" }
          remove_field => ["[log][file][path]"]
        }
        if [k8s_service] == "platform" {
          mutate {
            update => { "[fields][document_type]" => "nodejs-app" }
          }
          if [message] =~ /^{.*}$/ {
            json {
              tag_on_failure => ["_jsonparsefailure"]
              #skip_on_invalid_json => true
              source => "message"
              target => "doc"
              add_tag => [ "_processed_message" ]
            }
            useragent {
              source => "[doc][message][userAgent]"
            }
            date {
              match => [ "timestamp", "ISO8601" ]
              remove_field => [ "timestamp" ]
            }
            mutate {
              remove_field => [ "message" ]
            }
          } else {
            mutate { add_tag => [ "_badformat" ] }
          }
        }
      }
    }
    output {
      elasticsearch {
        hosts => ["managed-elasticsearch.us-west-2.es.amazonaws.com:443"]
        ssl => true
        ssl_certificate_verification => true
        user => "${username}"
        password => "${password}"
        index => "logstash-beat-%{[fields][document_type]}-%{+YYYY.MM.dd}"
        ilm_enabled => false
      }
      #stdout { codec => rubydebug }
    }

envFrom:
  - secretRef:
      name: logstash-credentials

logstashJavaOpts: "-Xmx2g -Xms2g"

resources:
  requests:
    cpu: "100m"
    memory: "2560Mi"
  limits:
    cpu: "1000m"
    memory: "2560Mi"

volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 10Gi

# Probes
# Default probes are using `httpGet` which requires that `http.host: 0.0.0.0` is part of
# `logstash.yml`. If needed, probes can be disabled or overridden using the following syntaxes:
#
# disable livenessProbe
livenessProbe: null
#
# replace httpGet default readinessProbe by some exec probe
readinessProbe:
  httpGet: null
  exec:
    command:
      - curl
      - -sS
      - 127.0.0.1:9600

rbac:
  create: true
  serviceAccountName: "logstash-logstash"
  serviceAccountAnnotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::REDACTED:role/acme-iam-k8s-logstash-bf936bf3a598cbd4403ec666d2d6bbb7"

service:
  annotations:
  type: ClusterIP
  ports:
    - name: beats
      port: 5044
      protocol: TCP
      targetPort: 5044

I have been falling back to hardcoded AWS credentials. Does anyone know anything about IAM serviceaccount (IRSA) support? It seems the AWS SDK for Ruby that Logstash uses is quite old.

It does not. According to the documentation, the supported authentication methods are:

  1. Static configuration, using access_key_id and secret_access_key params in logstash plugin config
  2. External credentials file specified by aws_credentials_file
  3. Environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
  4. Environment variables AMAZON_ACCESS_KEY_ID and AMAZON_SECRET_ACCESS_KEY
  5. IAM Instance Profile (available when running inside EC2)
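The lookup order above can be sketched as a simple chain. Note that nothing in it reads AWS_ROLE_ARN or AWS_WEB_IDENTITY_TOKEN_FILE, which is why the IRSA-injected variables are ignored. This is an illustrative sketch of the documented order, not the plugin's actual code:

```ruby
# Illustrative sketch of the credential lookup order documented for
# the plugin (NOT the plugin's real implementation). There is no
# branch for AWS_WEB_IDENTITY_TOKEN_FILE / AWS_ROLE_ARN, so the
# IRSA-injected variables never participate in the resolution.
def resolve_credentials(config, env)
  if config[:access_key_id] && config[:secret_access_key]
    :static_config        # 1. keys in the plugin config
  elsif config[:aws_credentials_file]
    :credentials_file     # 2. external credentials file
  elsif env["AWS_ACCESS_KEY_ID"] && env["AWS_SECRET_ACCESS_KEY"]
    :env_aws              # 3. AWS_* environment variables
  elsif env["AMAZON_ACCESS_KEY_ID"] && env["AMAZON_SECRET_ACCESS_KEY"]
    :env_amazon           # 4. AMAZON_* environment variables
  else
    :instance_profile     # 5. EC2 instance profile (last resort)
  end
end

# A pod with only the IRSA variables set falls through to the instance
# profile, which does not carry the annotated role, hence Access Denied.
irsa_only = { "AWS_ROLE_ARN" => "arn:aws:iam::111122223333:role/example" }
puts resolve_credentials({}, irsa_only)   # => instance_profile
```

Web identity (IRSA) credential support arrived in aws-sdk for Ruby v3; the aws-sdk-core 2.11.x bundled with this plugin predates it, so falling back to one of the five listed methods is required.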