Logstash fields and tags: can they be seen when output to an S3 bucket, and can they be referenced when picked up from the S3 bucket?

Hello,

Using ELK 7.6.

The current setup I have at the moment is:
1 - Data comes in from a remote Logstash instance and is written to an S3 bucket (Bucket A).
2 - An EC2 instance (Instance 1) collects the data from Bucket A via Filebeat, which follows an SQS queue. The Logstash agent on Instance 1 filters and sorts the data into specific types using tags. Once this is done, the Logstash agent outputs to another S3 bucket (Bucket B) under the intended directory. On the S3 output, tags are set and can be seen in the names of the files in the bucket (a rough sketch of this tag-based routing follows the list).
3 - Another EC2 instance (Instance 2) then accesses Bucket B. The Logstash agent on this instance further filters and enriches the data, again using tags to decide what action should be taken on a specific data set.
4 - Once the filtering has been actioned, the Logstash agent ships the data over to Elastic Cloud, where it should (hopefully) be shown in Kibana.
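
To give an idea of what I mean by the tag-based routing in step 2, here is a rough sketch of how the Instance 1 output could send events to different prefixes in Bucket B based on their tags. The tag name and prefixes here are just illustrative, not taken from the real pipeline:

output {
  # Route events to different prefixes in Bucket B based on their tags.
  # "eventlog" and the prefixes are hypothetical examples.
  if "eventlog" in [tags] {
    s3 {
      region => "eu-west-2"
      bucket => "bucket-b"
      prefix => "testing/eventlogs/"
    }
  } else {
    s3 {
      region => "eu-west-2"
      bucket => "bucket-b"
      prefix => "testing/other/"
    }
  }
}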

The questions are:

  • When the data is transported from Bucket A (picked up by Instance 1) and sent on to Bucket B, are the data/tags/fields that have been created or modified unable to be carried through to Bucket B? Do they get removed?
  • When Instance 2 performs its enrichment on the data collected, given the tags are not visible on pickup from Bucket B, will the data not be interpreted correctly and therefore not be shown in Kibana?
  • If this is the case, is there a way to make the tags/fields visible when viewing the data in the S3 buckets? If there isn't, does this mean that Instance 2 will have to repeat the same filtering as Instance 1?

Examples of Confs from Instance 1 are below:

input {
  beats {
    port => 5044
  }
}
filter {
  mutate {
    add_tag => ["tag1", "tag2", "eventlog"]
  }
}
output {
  s3 {
    region => "eu-west-2"
    bucket => "bucket-b"
    prefix => "testing/"
    #codec => "plain"
    codec => multiline {
      pattern => "$\n"
      what => "previous"
    }
    time_file => 1
    size_file => 2048
    rotation_strategy => "size_and_time"
    validate_credentials_on_root_bucket => false
    tags => ["tag1", "tag2", "eventlog"]
  }
}
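
One thing I have been looking at (not sure if this is the right approach) is serialising the whole event as JSON on the way into Bucket B, so that the fields and tags travel inside the objects themselves rather than only in the file names. A rough sketch of the same output with a json_lines codec instead of the multiline one:

output {
  s3 {
    region => "eu-west-2"
    bucket => "bucket-b"
    prefix => "testing/"
    # json_lines writes each event (including its fields and tags) as a JSON document on its own line
    codec => "json_lines"
    time_file => 1
    size_file => 2048
    rotation_strategy => "size_and_time"
    validate_credentials_on_root_bucket => false
  }
}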

Examples of Confs from Instance 2 are below:

input {
  s3 {
    region => "eu-west-2"
    bucket => "bucket-b"
    prefix => "sprint-testing/"
    codec => "json"
    tags => ["logstash-s3-pickup"]
  }
}
filter {
  grok {
    match => { "message" => "%{GREEDYDATA:this_is_an_example}" }
  }
  mutate {
    remove_tag => ["tag1", "tag2", "eventlog"]
    add_tag => ["tag1", "tag2", "tag3", "eventlog"]
  }
}
output {
  elasticsearch {
    hosts => ["https://XXXXXXXXXXXXXXXX.eu-west-2.aws.cloud.es.io:9243"]
    user => "username"
    password => "password"
    action => "create"
    index => "eventlogs"
    document_type => "tester"
    id => "tester_eventlogs"
  }
}
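
If the tags did come through from Bucket B (for example because the objects are JSON and the json codec on the input restores them), I assume the filter on Instance 2 could then branch on them instead of repeating the Instance 1 filtering. A rough sketch of what I have in mind, using the tag names from the example confs above:

filter {
  # Only enrich events that Instance 1 already tagged as event logs
  if "eventlog" in [tags] {
    mutate {
      add_tag => ["tag3"]
    }
  }
}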

Thank you!
