Reading existing files with the Logstash S3 input plugin

Hi!

I'm trying to read some files we already have in an S3 bucket with Logstash. My .conf file looks like this:

input {
  s3 {
    "access_key_id" => "my_key"
    "secret_access_key" => my_secret_key"
    "bucket" => "my_bucket"
  }
}

output {
  stdout {}
}

The output shown in the terminal is as follows:

Sending Logstash logs to /home/my_user/.../tmp/logstash-6.6.1/logs which is now configured via log4j2.properties
[2019-03-04T10:59:41,177][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2019-03-04T10:59:41,190][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.6.1"}
[2019-03-04T10:59:59,101][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2019-03-04T10:59:59,335][INFO ][logstash.outputs.rabbitmq] Connected to RabbitMQ at 
[2019-03-04T10:59:59,427][INFO ][logstash.inputs.s3       ] Registering s3 input {:bucket=>"my_bucket", :region=>"us-east-1"}
[2019-03-04T10:59:59,808][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x7d681967 run>"}
[2019-03-04T10:59:59,852][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-03-04T11:00:00,133][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
S3 client configured for "us-east-1" but the bucket "my_bucket" is in "eu-west-1"; Please configure the proper region to avoid multiple unnecessary redirects and signing attempts
[2019-03-04T11:00:02,701][ERROR][logstash.inputs.s3       ] S3 input: Unable to list objects in bucket {:prefix=>nil, :message=>"The request signature we calculated does not match the signature you provided. Check your key and signing method."}

This is confusing me a little, given that the prefix does not seem to be a compulsory argument... Using a tiny Python script with the same credentials and bucket I provide in the Logstash config file, I have verified that the files are already there, but I cannot make Logstash start indexing the previously stored files. Does Logstash's S3 plugin only monitor new files by default? If so, how can I make Logstash process all the previously stored files?

You need to set the region option on the input.
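By default the plugin does process existing objects: on every polling interval it lists the bucket (or the configured prefix) and records how far it got in a sincedb file, so once listing works your old files should come through. A minimal sketch, keeping your placeholders and assuming the bucket really is in eu-west-1 as the warning in your log suggests:

input {
  s3 {
    access_key_id     => "my_key"
    secret_access_key => "my_secret_key"
    bucket            => "my_bucket"
    region            => "eu-west-1"  # taken from the region warning in your first log
  }
}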

Oops. Is it compulsory? AFAIK it is not required. And even if it is not set correctly, wouldn't that only make things a bit slower? In any case, I set region => "eu-west-1" and tried again:

Sending Logstash logs to /home/my_user/.../tmp/logstash-6.6.1/logs which is now configured via log4j2.properties
[2019-03-04T17:10:48,896][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2019-03-04T17:10:48,911][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.6.1"}
[2019-03-04T17:11:08,558][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2019-03-04T17:11:08,779][INFO ][logstash.outputs.rabbitmq] Connected to RabbitMQ at 
[2019-03-04T17:11:08,891][INFO ][logstash.inputs.s3       ] Registering s3 input {:bucket=>"my_bucket", :region=>"eu-west-1"}
[2019-03-04T17:11:09,231][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x73d24490 run>"}
[2019-03-04T17:11:09,275][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-03-04T17:11:09,538][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2019-03-04T17:11:11,370][ERROR][logstash.inputs.s3       ] S3 input: Unable to list objects in bucket {:prefix=>nil, :message=>"The request signature we calculated does not match the signature you provided. Check your key and signing method."}

The error seems to be the same, and it might be linked to the prefix => nil message, which keeps being printed again every 60 seconds… Note that I am still not pushing new files to the S3 bucket; I only want to get the "old" files...
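I believe the 60 seconds is just the plugin's default interval between bucket listings. To rule the prefix out, I may also try narrowing the listing to one folder with an explicit prefix; a sketch, where the folder is only a hypothetical example from my bucket layout:

input {
  s3 {
    access_key_id     => "my_key"
    secret_access_key => "my_secret_key"
    bucket            => "my_bucket"
    region            => "eu-west-1"
    prefix            => "LOGS/ES/"  # hypothetical folder, matching the example path below
  }
}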

You are right, it's not compulsory. I misread your log.

I'm starting to think that it may be an issue with the encoding of the objects, because of the answers in this link: https://stackoverflow.com/questions/30518899/how-to-fix-the-request-signature-we-calculated-does-not-match-the-signature-er#30519762
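Those answers suggest the signature error often comes from a secret key containing characters like "/", "+" or "=". One thing I will try, assuming the plugin supports the aws_credentials_file option from Logstash's AWS mixin, is moving the credentials out of the pipeline config so quoting cannot interfere:

input {
  s3 {
    # credentials loaded from a YAML file instead of inline strings;
    # the path is only a hypothetical example
    aws_credentials_file => "/etc/logstash/aws_credentials.yml"
    bucket               => "my_bucket"
    region               => "eu-west-1"
  }
}

with the YAML file containing:

:access_key_id: "my_key"
:secret_access_key: "my_secret_key"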

Are bucket names containing a "-" character error-prone? And key paths containing "/", like "LOGS/ES/BOT-2019_02_28.log"?
