How to configure GCS as a Filebeat input

Hello Team,

We are storing our audit logs in a GCS bucket. We would like to ingest them into Elasticsearch on demand - not on a regular schedule - using Filebeat. I have checked the S3 option, which lets us use S3-like storage as an input via providers.

I'm using the following configuration, but it is not writing any data, even though the configuration passes when I test it with Filebeat.

I suspect my input configuration is wrong in some way. Please check the following and help me understand what's wrong.

filebeat.inputs:
  - type: gcp
    project_id: gcp-project-xxx
    bucket_name: log-bucket
    credentials_file: /tmp/service-account-key.json

output.elasticsearch:
  hosts: "https://es-test-xxx.aivencloud.com"
  username: "avnadmin"
  password: "xxxxx"
  indices:
    - index: 'restore-test'

There is no gcp input; it's gcp-pubsub, and your config isn't valid for that. See GCP Pub/Sub input | Filebeat Reference [7.16] | Elastic on how to configure it. If all you want to do is read log files from an S3-compatible bucket, see the aws-s3 input on how to poll a bucket: AWS S3 input | Filebeat Reference [7.16] | Elastic.
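For reference, a minimal gcp-pubsub configuration would look something like this (the topic and subscription names here are hypothetical); note that it reads messages from a Pub/Sub subscription, not objects from a bucket:

filebeat.inputs:
- type: gcp-pubsub
  project_id: gcp-project-xxx
  topic: audit-logs-topic
  subscription.name: filebeat-audit-sub
  credentials_file: /tmp/service-account-key.json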

First I thought the same, @legoguy1000, until I found this: AWS S3 input | Filebeat Reference [master] | Elastic

I'm confused - what on that page made you think you couldn't poll a GCP bucket that is S3 compatible? See amazon web services - How to access Google Cloud Storage bucket using aws-cli - Stack Overflow on how to do it via the AWS CLI.

Apologies for the miscommunication on my end.

What I meant was that we could use a GCP provider in the Filebeat input configuration the way we are using it for S3, as follows:

filebeat.inputs:
- type: aws-s3
  non_aws_bucket_name: test-s3-bucket
  number_of_workers: 5
  bucket_list_interval: 300s
  access_key_id: xxxxxxx
  secret_access_key: xxxxxxx
  endpoint: https://s3.example.com:9000
  expand_event_list_from_field: Records

I just want to know how we can do the same to fetch GCS objects into Elasticsearch.

You would configure it like so. You can also test with the AWS CLI, using the link from the previous post, to verify the keys and endpoint.

filebeat.inputs:
- type: aws-s3
  non_aws_bucket_name: test-s3-bucket
  number_of_workers: 5
  bucket_list_interval: 300s
  access_key_id: xxxxxxx
  secret_access_key: xxxxxxx
  endpoint: https://storage.googleapis.com
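For a GCS bucket, the access_key_id and secret_access_key would be HMAC keys generated under the Interoperability settings in GCS, not the service-account JSON. As a quick sanity check of the keys and endpoint (bucket name is just an example):

aws s3 ls s3://test-s3-bucket --endpoint-url https://storage.googleapis.com

with AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY set to the HMAC key pair.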

Thanks @legoguy1000
I tried the above configuration; however, I don't understand why it is still checking for bucket_arn or queue_url even though I provided non_aws_bucket_name.

error:

WARN	[aws-s3]	awss3/config.go:54	neither queue_url nor bucket_arn were provided, input aws-s3 will stop
INFO	[crawler]	beater/crawler.go:141	Starting input (ID: 17738867761700079737)
INFO	[crawler]	beater/crawler.go:108	Loading and starting Inputs completed. Enabled inputs: 1
INFO	[input.aws-s3]	compat/compat.go:111	Input aws-s3 starting	{"id": "F62D1E3EA5C30879"}
INFO	[input.aws-s3]	compat/compat.go:124	Input 'aws-s3' stopped	{"id": "F62D1E3EA5C30879"}

Should we make any changes to the type? Perhaps changing aws-s3 to gcp-gcs (I'm not sure).

My apologies - the ability to poll non-AWS buckets was only added in 8.0.0 and wasn't backported to 7.x. You'll have to wait until 8.0 is released to do what I explained. To provide a bit more clarification: you can't just change the input name. There is a specific list of inputs that can be used: Configure inputs | Filebeat Reference [7.16] | Elastic.
