This happens when using the Logstash S3 output. I have tried many combinations of size_file, time_file, and rotation_strategy to fix it, with no joy.
Do you have the S3 credentials in there, like this?
s3 {
  access_key_id => "${aws_access_key}"
  secret_access_key => "${aws_secret_key}"
  region => "${aws_region}"
  bucket => "${aws_bucket}"
  codec => "json_lines"
  canned_acl => "bucket-owner-full-control"
}
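Those ${aws_access_key}-style references are resolved at startup from the Logstash keystore or from environment variables, so they have to be defined on the host where Logstash runs.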
Also, whatever account you are using has to have privileges on the S3 bucket in AWS IAM. I do this via an IAM policy with the JSON below and apply the policy to the user.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket-name>/*",
        "arn:aws:s3:::<bucket-name>"
      ]
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket-name>/logstash-programmatic-access-test-object-*"
      ]
    }
  ]
}
Thank you for responding. The data is actually getting to the bucket successfully, but only intermittently; the pipeline gets stuck intermittently. Another important detail about the configuration is that I run two Logstash servers with the exact same configuration and only the second one is affected. They are fed from Filebeat configured with loadbalance: true.
The second server that is failing is also able to write successfully; it just fails intermittently. It writes successfully to the S3 bucket, going to several sub-directories, and then all of a sudden I begin to see the failures listed above. The problem seems to me to be somewhat file-size related. But no matter what I do to reduce the file size, and even when I use only size as the rotation strategy before sending to the S3 bucket, it continues to build larger files in the tmp directory.
Also, I have configured three pipelines going to an S3 bucket with different sub-directories based on criteria. I have kind of backed off the configuration, but what I have found is that the pipelines work for a certain amount of time and then just stop working, and I see the errors as listed initially. I have also begun using the monitoring/performance view of the pipelines, again with no joy.
But to answer your question specifically: I have given the s3objects role to the EC2 instance to allow write capability to the S3 bucket.
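For reference, when an instance role is doing the authentication, the s3 output can be written without any static keys and the AWS SDK picks the credentials up from the instance profile. A minimal sketch, with placeholder region and bucket values:

s3 {
  region => "us-east-1"                       # placeholder
  bucket => "bucket-logstash"                 # placeholder
  codec => "json_lines"
  canned_acl => "bucket-owner-full-control"
  # no access_key_id / secret_access_key: the instance profile supplies them
}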
generally...here is my config...
3 pipelines
Pipeline #3 is where the errors occur most often, though sometimes they show up in the others. It is probably the simplest of the three, but it fails the most often, on the application group in particular; those logs are a little larger than the other groups'. But no matter what I set for size_file and rotation_strategy, it never rotates at the 2048-byte file size. It almost seems like it is always using the size-and-time rotation; size alone does not seem to work.
input {
  beats {
    port => 5046
    host => "xx.xx.xx.xx"
  }
}
output {
  if [fields][level] == "os" {
    s3 {
      region => "xx-xx-xx-xx"
      bucket => "bucket-logstash"
      prefix => "%{+YYYY}/%{+dd}/Logs-elasticsearch-205/audit"
      endpoint => "http://xxxxxxxxx"
      id => "beats"
      tags => "beats"
      codec => "plain"
      validate_credentials_on_root_bucket => false
      size_file => 2048
      rotation_strategy => "size"
      enable_metric => false
    }
  }
  if [fields][level] == "application" {
    s3 {
      region => "xx-xx-xx-xx"
      bucket => "bucket-logstash"
      prefix => "%{+YYYY}/%{+dd}/Logs-elasticsearch-205/application"
      endpoint => "http://xxxxxxxxx"
      id => "beats"
      tags => "beats"
      codec => "plain"
      validate_credentials_on_root_bucket => false
      size_file => 2048
      rotation_strategy => "size"
      enable_metric => false
    }
    elasticsearch {
      hosts => ["https://host1:9200", "https://host2:9200", "https://host3:9200"]
      ssl_certificate_verification => false
      index => "logstash-%{+YYYY.MM.dd}"
    }
  }
}
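For illustration, here is a minimal variant of the application branch with unquoted numeric and boolean values, a unique id, and its own temporary_directory; the id and directory path are examples, not settings from the original config:

s3 {
  region => "xx-xx-xx-xx"
  bucket => "bucket-logstash"
  prefix => "%{+YYYY}/%{+dd}/Logs-elasticsearch-205/application"
  id => "s3-application"                                   # unique per plugin instance (example name)
  codec => "plain"
  validate_credentials_on_root_bucket => false             # boolean, not a quoted string
  rotation_strategy => "size"
  size_file => 2048                                        # bytes, unquoted integer
  temporary_directory => "/tmp/logstash/s3-application"    # example: a separate staging dir per output
  enable_metric => false
}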
I don't use endpoint or validate_credentials_on_root_bucket or enable_metric, so you might try disabling those and see if that helps.
I also use
canned_acl => "bucket-owner-full-control"
though it's been so long I don't recall why, but it is a difference between your config and mine.
I believe I removed both of those. Again, the issue is that the data does come through some of the time. I have three pipelines configured the exact same way; some are working, and even this pipeline works sometimes. It's just that at some point it begins to fail. It's not that the transfer never makes the connection to the S3 bucket; it's that at some point it begins to fail. All pipelines are configured the same, yet they don't all experience the same problem.
Thanks for your help.
Annette
This is worth a try. Set rotation_strategy to size_and_time and then set time_file to something like 5 minutes, which should be well below S3's socket timeout value.
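A sketch of what that looks like with the 6.8-era option names, reusing the bucket and prefix from the config above (values are illustrative):

s3 {
  region => "xx-xx-xx-xx"
  bucket => "bucket-logstash"
  prefix => "%{+YYYY}/%{+dd}/Logs-elasticsearch-205/application"
  rotation_strategy => "size_and_time"    # rotate on whichever limit is reached first
  size_file => 2048                       # bytes
  time_file => 5                          # minutes, well below the S3 socket timeout
}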
OK. Sadly that did not help. The messages flow for a certain amount of time and then everything just stops running. Another observation I made this morning: when I stop one of the two EC2 instances running Logstash with duplicate configurations, accepting output from Filebeat in load-balance mode, it appears that the input from one instance is duplicated to the second EC2 Logstash instance. I noticed files with the exact same size on the second EC2 instance just as I brought down the first instance. This is in spite of the fact that the output still appears in the /tmp..../home/logstash/logstash directory on the server that I just brought down. Is there some parameter that is making the output be duplicated to the second Logstash EC2 instance while also appearing on the first instance? I configured them to run in load-balance mode from Filebeat.
Reaching out again, as I seem to be getting no closer. I have what appears to be a performance issue. There seems to be about a 15 KB limit on what gets to my S3 bucket successfully. No matter what I try on the Logstash 6.8 S3 output, I can only get the file size so small before transmitting to the S3 bucket. This is largely not an issue, except when a large file arrives in a very short amount of time in the Logstash /tmp directory. Even if I use just size, at a certain point the file is still growing and thus does not make it to the S3 bucket successfully. There is some foo I am missing between size_file, time_file, and rotation_strategy for the S3 output. I have tried countless combinations. Any help would be appreciated. Thanks for reading this.
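If the bottleneck is the upload side rather than rotation, the plugin's upload tuning options may be worth a try; a hedged sketch with purely illustrative values:

s3 {
  region => "xx-xx-xx-xx"
  bucket => "bucket-logstash"
  prefix => "%{+YYYY}/%{+dd}/Logs-elasticsearch-205/application"
  rotation_strategy => "size"
  size_file => 2048
  upload_workers_count => 4    # more parallel uploads to S3 (illustrative)
  upload_queue_size => 8       # rotated files allowed to wait for upload (illustrative)
}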
I have fiddled a lot with workers, batch sizes, and Filebeat config items, with no joy. I need to either reduce the size of the file coming from Filebeat or get the file to rotate sooner in Logstash so it makes it to S3. There are 40+ servers and apps coming into this Logstash server and going through if statements like the ones above, i.e. there are 40+ if statements for apps and servers in the Logstash config.
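One way to shrink those 40+ if blocks, sketched here on the assumption that each branch only differs by sub-directory and that [fields][level] carries that name, is to interpolate the field into a single s3 output's prefix:

output {
  s3 {
    region => "xx-xx-xx-xx"
    bucket => "bucket-logstash"
    prefix => "%{+YYYY}/%{+dd}/Logs-elasticsearch-205/%{[fields][level]}"    # one output, per-level sub-directory
    codec => "plain"
    rotation_strategy => "size"
    size_file => 2048
  }
}

Each distinct prefix value gets its own temporary file on disk, so this trades the long if chain for more open temp files.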