S3 input plugin not sending to Elasticsearch

(Patrik Iselind) #1


I have an issue where the S3 input plugin doesn't seem to propagate the processed data at all. I can see that the files in my bucket are being processed:

logstash_1       | [2018-03-05T13:09:36,102][DEBUG][logstash.inputs.s3       ] S3 input processing {:bucket=>"idd-data", :key=>"test10/ls.s3.a8ea640e-1080-428c-930e-27eb19499027.2018-02-08T08.05.part0.txt.gz"}
logstash_1       | [2018-03-05T13:09:36,102][DEBUG][logstash.inputs.s3       ] S3 input: Download remote file {:remote_key=>"test10/ls.s3.a8ea640e-1080-428c-930e-27eb19499027.2018-02-08T08.05.part0.txt.gz", :local_filename=>"/tmp/logstash/ls.s3.a8ea640e-1080-428c-930e-27eb19499027.2018-02-08T08.05.part0.txt.gz"}
logstash_1       | [2018-03-05T13:09:36,206][DEBUG][logstash.inputs.s3       ] Processing file {:filename=>"/tmp/logstash/ls.s3.a8ea640e-1080-428c-930e-27eb19499027.2018-02-08T08.05.part0.txt.gz"}

The problem is that I have a huge number of files for Logstash to process, and the processed data doesn't seem to get propagated as I'd expect.

I expect the processed data to be sent to Elasticsearch in smaller batches. My feeling is that the input plugin doesn't send the data down the pipeline until all the files have been processed. So far I've waited more than two hours for data, but none has appeared, and the input plugin is still processing the input files.

I'm using ELK stack version 6.2.2 for this, running in Docker containers through docker-compose.

I'm not sure what I can or should do about this. For example, I cannot find any option that says data should be pushed down the pipeline either when all files have been processed for the current round or when, say, 1000 entries/lines have been received.

This is my pipeline.

input {
    s3 {
        region => "eu-west-1"
        bucket => "idd-data"
        prefix => "${bucket_prefix:}"
        codec => json_lines
    }
}

filter {
}

output {
    elasticsearch {
        hosts => "http://elasticsearch:9200"
    }
    stdout {
        codec => rubydebug
    }
}

Any suggestions?

(Tag V) #2

Either try running Logstash in debug mode (append --debug to the regular command) or try removing the /tmp/logstash folder (on Linux machines), where the .sincedb paths reside.
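
A minimal sketch of the second suggestion, assuming the S3 input's `temporary_directory` and `sincedb_path` options: setting both explicitly makes it clear which files to delete before a re-run. The `sincedb_path` value below is hypothetical and chosen only for illustration; `/tmp/logstash` matches the download path shown in the debug log in the first post, and the remaining settings are copied from the pipeline there.

```
input {
    s3 {
        region => "eu-west-1"
        bucket => "idd-data"
        prefix => "${bucket_prefix:}"
        codec => json_lines
        # Staging directory for downloaded objects (same path as in the debug log).
        temporary_directory => "/tmp/logstash"
        # Hypothetical location for the file that records how far the input has read.
        sincedb_path => "/usr/share/logstash/data/s3-idd-data.sincedb"
    }
}
```

With the state file in a known place, deleting it and restarting Logstash should make the input treat the objects under the prefix as unprocessed again.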

(Patrik Iselind) #3

Thanks, I'll give it a try.
