Logstash not processing JSON files present on S3

I am trying to read JSON files from my S3 bucket through Logstash and output them to Elasticsearch. Both Logstash and Elasticsearch are running locally.

I do know Logstash is able to connect to S3 from the logs (and if I enter any of the S3 details incorrectly in my Logstash conf file, it errors out, which means it is connecting fine), but somehow Logstash is not processing the files present in the bucket, and there are no errors in the Logstash logs.

Please let me know how I can debug this further to see what the problem is (I tried reading CSV files as well, but that isn't working either).
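For what it's worth, the only check I can think of from outside Logstash is listing the bucket with the AWS CLI under the same access key (assuming the CLI is installed; the profile name below is just a placeholder for wherever those credentials live):

# confirm the bucket actually has objects under the configured prefix
aws s3 ls s3://mybucket/ --recursive --profile logstash-user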

Here's my Logstash conf file:

input {
  s3 {
    access_key_id => "mykeyid"
    bucket => "mybucket"
    region => "us-east-1"
    secret_access_key => "mykey"
    prefix => "/"
    type => "s3"
    delete => true
    sincedb_path => "/Applications/logstash-5.4.1/last-s3-file"
    codec => "json"
  }
}

filter {
  # set the event timestamp
  date {
    match => [ "time", "UNIX" ]
  }

  # add geoip attributes
  geoip {
    source => "ip"
  }
}

output {
  elasticsearch {
    index => "bitcoin-s3-prices"
    hosts => ["localhost:9200"]
  }
  stdout { codec => rubydebug }
}

And the following are the Logstash logs:

[2017-06-15T21:39:10,082][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#<URI::HTTP:0x72302bf6 URL:http://localhost:9200/>}
[2017-06-15T21:39:10,087][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-06-15T21:39:10,151][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword"}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-06-15T21:39:10,168][DEBUG][logstash.outputs.elasticsearch] Found existing Elasticsearch template. Skipping template management {:name=>"logstash"}
[2017-06-15T21:39:10,170][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>[#<URI::Generic:0x779060dc URL://localhost:9200>]}
[2017-06-15T21:39:10,175][INFO ][logstash.filters.geoip ] Using geoip database {:path=>"/Applications/logstash-5.4.1/vendor/bundle/jruby/1.9/gems/logstash-filter-geoip-4.0.4-java/vendor/GeoLite2-City.mmdb"}
[2017-06-15T21:39:10,192][INFO ][logstash.pipeline ] Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>500}
[2017-06-15T21:39:10,200][INFO ][logstash.inputs.s3 ] Registering s3 input {:bucket=>"sidcloudbucket", :region=>"us-east-1"}
[2017-06-15T21:39:10,249][INFO ][logstash.pipeline ] Pipeline main started
[2017-06-15T21:39:10,253][DEBUG][logstash.agent ] Starting puma
[2017-06-15T21:39:10,254][DEBUG][logstash.agent ] Trying to start WebServer {:port=>9600}
[2017-06-15T21:39:10,255][DEBUG][logstash.api.service ] [api-service] start
[2017-06-15T21:39:10,311][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2017-06-15T21:39:13,257][DEBUG][logstash.agent ] Reading config file {:config_file=>"/Applications/logstash-5.4.1/logstash.conf"}
[2017-06-15T21:39:13,258][DEBUG][logstash.agent ] no configuration change for pipeline {:pipeline=>"main"}
[2017-06-15T21:39:15,252][DEBUG][logstash.pipeline ] Pushing flush onto pipeline
[2017-06-15T21:39:16,258][DEBUG][logstash.agent ] Reading config file {:config_file=>"/Applications/logstash-5.4.1/logstash.conf"}

I found the issue: it was with the permissions of the objects in the S3 bucket. I believe the S3 plugin needs improved debug logging for cases like this.
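As a sketch of the kind of check that catches this (assuming the AWS CLI is set up with the same keys Logstash uses; the object key is a placeholder), listing the bucket is not enough — you need to try actually downloading an object, since that is what fails when the object-level permissions are wrong:

# listing can succeed even when the objects themselves are not readable
aws s3 ls s3://mybucket/
# downloading an object is the real test; it fails with an access error
# if the key can list the bucket but not read the object
aws s3 cp s3://mybucket/prices.json ./prices.json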

We also rely on the Ruby library from Amazon.

It seems that it silently fails when retrieving objects from a bucket when incorrect credentials are used. I suspect this might be to deter hacking: you won't know whether the bucket is actually empty or the credentials are wrong.

If the credentials are wrong or the bucket doesn't exist, it errors out saying something is wrong with the details. But if it's a permissions issue, the logs stay silent.
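For anyone hitting the same thing, one way to confirm an object-level permission problem is to look at the object's ACL directly with the AWS CLI (the object key below is a placeholder):

# show which grantees have READ on a specific object
aws s3api get-object-acl --bucket mybucket --key prices.json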
