Logstash not processing JSON files present on S3

I am trying to read JSON files from my S3 bucket through Logstash and output them to Elasticsearch. Both Logstash and Elasticsearch are running locally.

I do know Logstash is able to connect to S3 from the logs (and if I enter any of the S3 details incorrectly in my Logstash conf file, it errors out, which means it is connecting fine), but somehow Logstash is not processing the files present in the bucket, and there are no errors in the Logstash logs.

Please let me know how I can debug this further to see what the problem is (I tried reading CSV files as well, but that isn't working either).
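For what it's worth, the only check I can think of from outside Logstash is listing the bucket with the AWS CLI under the same access key (assuming the CLI is installed; the profile name below is just a placeholder for wherever those credentials live):

# confirm the bucket actually has objects under the configured prefix
aws s3 ls s3://mybucket/ --recursive --profile logstash-user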

Here's my Logstash conf file:

input {
  s3 {
    access_key_id => "mykeyid"
    bucket => "mybucket"
    region => "us-east-1"
    secret_access_key => "mykey"
    prefix => "/"
    type => "s3"
    delete => true
    sincedb_path => "/Applications/logstash-5.4.1/last-s3-file"
    codec => "json"
  }
}

filter {
  # set the event timestamp
  date {
    match => [ "time", "UNIX" ]
  }

  # add geoip attributes
  geoip {
    source => "ip"
  }
}

output {
  elasticsearch {
    index => "bitcoin-s3-prices"
    hosts => ["localhost:9200"]
  }
  stdout { codec => rubydebug }
}

And the following are the Logstash logs:

[2017-06-15T21:39:10,082][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#<URI::HTTP:0x72302bf6 URL:http://localhost:9200/>}
[2017-06-15T21:39:10,087][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-06-15T21:39:10,151][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-06-15T21:39:10,168][DEBUG][logstash.outputs.elasticsearch] Found existing Elasticsearch template. Skipping template management {:name=>"logstash"}
[2017-06-15T21:39:10,170][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#<URI::Generic:0x779060dc URL://localhost:9200>]}
[2017-06-15T21:39:10,175][INFO ][logstash.filters.geoip ] Using geoip database {:path=>"/Applications/logstash-5.4.1/vendor/bundle/jruby/1.9/gems/logstash-filter-geoip-4.0.4-java/vendor/GeoLite2-City.mmdb"}
[2017-06-15T21:39:10,192][INFO ][logstash.pipeline ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-06-15T21:39:10,200][INFO ][logstash.inputs.s3 ] Registering s3 input {:bucket=>"sidcloudbucket", :region=>"us-east-1"}
[2017-06-15T21:39:10,249][INFO ][logstash.pipeline ] Pipeline main started
[2017-06-15T21:39:10,253][DEBUG][logstash.agent ] Starting puma
[2017-06-15T21:39:10,254][DEBUG][logstash.agent ] Trying to start WebServer {:port=>9600}
[2017-06-15T21:39:10,255][DEBUG][logstash.api.service ] [api-service] start
[2017-06-15T21:39:10,311][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2017-06-15T21:39:13,257][DEBUG][logstash.agent ] Reading config file {:config_file=>"/Applications/logstash-5.4.1/logstash.conf"}
[2017-06-15T21:39:13,258][DEBUG][logstash.agent ] no configuration change for pipeline {:pipeline=>"main"}
[2017-06-15T21:39:15,252][DEBUG][logstash.pipeline ] Pushing flush onto pipeline
[2017-06-15T21:39:16,258][DEBUG][logstash.agent ] Reading config file {:config_file=>"/Applications/logstash-5.4.1/logstash.conf"}

I found the issue: it was with the permissions of the objects in the S3 bucket. I believe the S3 plugin needs improved debug logging for cases like this.
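As a sketch of the kind of check that catches this (assuming the AWS CLI is set up with the same keys Logstash uses; the object key is a placeholder), listing the bucket is not enough — you need to try actually downloading an object, since that is what fails when the object-level permissions are wrong:

# listing can succeed even when the objects themselves are not readable
aws s3 ls s3://mybucket/
# downloading an object is the real test; it fails with an access error
# if the key can list the bucket but not read the object
aws s3 cp s3://mybucket/prices.json ./prices.json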

We also rely on the Ruby library from Amazon.

It seems that it silently fails when retrieving objects from a bucket when incorrect credentials are used. I suspect this might be to deter hacking: you won't know whether the bucket is actually empty or the credentials are wrong.

If the credentials are wrong or the bucket doesn't exist, it errors out saying something is wrong with the details. But if it's a permissions issue, the logs stay silent.
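For anyone hitting the same thing, one way to confirm an object-level permission problem is to look at the object's ACL directly with the AWS CLI (the object key below is a placeholder):

# show which grantees have READ on a specific object
aws s3api get-object-acl --bucket mybucket --key prices.json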
