Creating multiple indexes with multiple S3 inputs

Hi there,

I have two S3 buckets.
Bucket A contains access logs, which I parse with a grok filter.
Bucket B contains CSV files.

Every time I run Logstash, the csv filter incorrectly tries to parse the access logs, even though its input points at bucket B. To overcome this, I made separate pipelines, each pointing to its own config file, and I tag each input so the outputs can be wrapped in a conditional.
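
I believe I could also have kept a single pipeline and wrapped the csv filter in a tag conditional instead, since Logstash concatenates all config files loaded by a pipeline and runs every filter against every event. Something roughly like the sketch below; the csv options are just placeholders, not my real settings:

filter {
  # Only events tagged by the bucket-B input reach the csv filter;
  # access-log events from bucket A skip it entirely.
  if "billingData" in [tags] {
    csv {
      separator => ","   # placeholder option
    }
  }
}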

Now when I run Logstash, the sincedb files are created and no errors are reported, but the index for bucket B is never created and the data is nowhere to be found. I know for sure that the filters are fine.

Below are my configs for each bucket:

Bucket A

input {
  s3 {
    region => "eu-west-1"
    bucket => "bucket-A"
    interval => "10"
    tags => [ "logData" ]
    additional_settings => {
      force_path_style => true
      follow_redirects => false
    }
  }
}

filter {
  grok {
    patterns_dir => ["/etc/logstash/patterns"]
    match => {
      "message" => [
        '%{WORD:ID} %{NOTSPACE:bucketName} \[%{NOTSPACE:datestamp} +%{INT:timezone}\] %{IPORHOST:hostName} %{NOTSPACE:requester} %{NOTSPACE:requestID} %{NOTSPACE:operation} %{NOTSPACE:key} "%{NOTSPACE:httpMethod} %{NOTSPACE:requestURI} %{NOTSPACE:protocol}" %{NOTSPACE:httpStatus} %{NOTSPACE:errorCode} %{NOTSPACE:bytesSent} %{NOTSPACE:objectSize} %{NOTSPACE:totalTime} %{NOTSPACE:turnAroundTime} %{GREEDYDATA:everythingElse}',
        '%{WORD:ID} %{NOTSPACE:bucketName} \[%{NOTSPACE:datestamp} +%{INT:timezone}\] %{IPORHOST:hostName} %{NOTSPACE:requester} %{NOTSPACE:requestID} %{NOTSPACE:operation} %{NOTSPACE:key} - %{NOTSPACE:httpStatus} %{NOTSPACE:errorCode} %{NOTSPACE:bytesSent} %{NOTSPACE:objectSize} %{NOTSPACE:totalTime} %{NOTSPACE:turnAroundTime} %{GREEDYDATA:everythingElse}'
      ]
    }
  }
  mutate {
    convert => {
      "totalTime"      => "integer"
      "bytesSent"      => "integer"
      "objectSize"     => "integer"
      "turnAroundTime" => "integer"
    }
  }
}



output {
  if "logData" in [tags] {
    elasticsearch {
      hosts => ["http://x.x.x.x:9200"]
      index => "access-log-%{+YYYY.MM.dd}"
      user => "elastic"
      password => "changeme"
    }
  }
}

Bucket B

input {
  s3 {
    region => "eu-west-1"
    bucket => "Bucket-B"
    interval => "10"
    tags => [ "billingData" ]
    additional_settings => {
      force_path_style => true
      follow_redirects => false
    }
  }
}

output {
  if "billingData" in [tags] {
    elasticsearch {
      hosts => ["http://x.x.x.x:9200"]
      index => "billing-%{+YYYY.MM.dd}"
      user => "elastic"
      password => "changeme"
    }
  }
}
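
The csv filter for bucket B lives in the same s3_report.conf; it isn't shown above, but a minimal version would look roughly like this (the column names are placeholders, not the real header of my billing file):

filter {
  csv {
    separator => ","
    # placeholder column names only
    columns => [ "invoice_id", "usage_type", "cost" ]
  }
}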

pipelines.yml

- pipeline.id: accessLogsPipeline
  path.config: "/etc/logstash/conf.d/*.conf"

- pipeline.id: billingPipeline
  path.config: "/etc/logstash/billing/s3_report.conf"
  queue.type: persisted

I figured it out after some troubleshooting. The problem wasn't with the pipelines but with the second bucket. It already had one CSV file in it, but Logstash doesn't pick up existing data, only new data uploaded after it starts watching the bucket. So all I did was delete the file and upload it again, and now the CSV events show up, along with the billing index :+1:
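
As far as I can tell, the s3 input remembers the last-modified time of the last object it processed in its sincedb file, so anything already older than that gets skipped. Re-uploading gives the file a fresh last-modified date, which is why it worked; deleting the sincedb (or pointing sincedb_path at a new location) before restarting should have the same effect. The path below is only an example:

input {
  s3 {
    region => "eu-west-1"
    bucket => "Bucket-B"
    interval => "10"
    tags => [ "billingData" ]
    # example path only; delete this file to make the plugin
    # re-read objects it has already marked as processed
    sincedb_path => "/var/lib/logstash/sincedb_bucket_b"
  }
}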
