Filebeat fails to parse GuardDuty json logs

Ameer_Mukadam · September 15, 2020, 2:20pm

I am trying to ingest GuardDuty logs from S3 into Elasticsearch using Filebeat's s3 input. The logs are getting indexed but not parsed: everything ends up under the message field, even though all the logs are in JSON format. I even added a processor to decode the JSON data from the "message" field, but then nothing gets indexed at all.

shaunak · September 15, 2020, 7:21pm

What version of Filebeat are you using?

Could you post your entire filebeat.yml configuration file, enclosed in ``` so formatting is preserved?

Could you post a couple of sample log entries from the GuardDuty logs?

Thanks,

Shaunak

Ameer_Mukadam · September 16, 2020, 5:00am

Below is my Filebeat config file; I am using Filebeat 7.9. The logs in S3 are gzipped and in JSON format. I will also upload a sample file.

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: s3
  queue_url:  https://url
  access_key_id: 'accesskey'
  secret_access_key: 'secretkey'
  json.keys_under_root: true


- type: log

  # Change to true to enable this input configuration.
  enabled: false

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /etc/filebeat/*.json
  json.keys_under_root: true
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be appended to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to next in Logstash
  #multiline.match: after

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false


# ================================== General ===================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

# =================================== Kibana ===================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  host: "10.10.10.11:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

# =============================== Elastic Cloud ================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

# ================================== Outputs ===================================

# Configure what output to use when sending the data collected by the beat.

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["10.10.10.11:9200"]

  # Protocol - either `http` (default) or `https`.
  protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  username: "elastic"
  password: "Tza0Ej9kL00l31PjMXmI"
  ssl.verification_mode: none

# ------------------------------ Logstash Output -------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"
#output.console:
 # pretty: true

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - decode_json_fields:
      fields: ["message"]
  - rename:
      fields:
      - from: accountId
        to: cloud.account.id
      - from: service.serviceName
        to: event.dataset
      - from: partition
        to: event.module
      - from: region
        to: cloud.region
      - from: resource.accessKeyDetails.userName
        to: user.name
      - from: service.action.awsApiCallAction.api
        to: event.action
      - from: service.action.awsApiCallAction.remoteIpDetails.city.cityName
        to: destination.geo.city_name
      - from: service.action.awsApiCallAction.remoteIpDetails.country.countryName
        to: destination.geo.country_name
      - from: service.action.awsApiCallAction.remoteIpDetails.geoLocation.lat
        to: destination.geo.location.lat
      - from: service.action.awsApiCallAction.remoteIpDetails.geoLocation.lon
        to: destination.geo.location.lon
      - from: service.action.awsApiCallAction.remoteIpDetails.ipAddressV4
        to: destination.ip
      - from: service.action.awsApiCallAction.remoteIpDetails.organization.asn
        to: destination.as.number
      - from: service.action.awsApiCallAction.remoteIpDetails.organization.org
        to: destination.as.organization.name
      - from: severity
        to: event.severity
      - from: service.detectorId
        to: event.id
  - add_fields:
      target: event
      fields:
        kind: "alert"

Ameer_Mukadam · September 16, 2020, 5:04am

Here is the log sample. I had to decompress it from the .gz file.

{"schemaVersion":"2.0","accountId":"45Y3Y44Y44767","region":"ap-south-1","partition":"aws","id":"453453464544gfhgfh","arn":"arn:aws:guardduty:ap-south-1:572067442387:detector/aeb572cacc01b6eee0026c7aacf2dc85/finding/3aba3e36aedd78f103b309510e69cbee","type":"Stealth:IAMUser/CloudTrailLoggingDisabled","resource":{"resourceType":"AccessKey","accessKeyDetails":{"accessKeyId":"ADSAFSDDAFGT657657","principalId":"HRHGBGDGR67","userType":"IAMUser","userName":"arun.reddy"}},"service":{"serviceName":"guardduty","detectorId":"aeb572cacc01b6eee0026c7aacf2dc85","action":{"actionType":"AWS_API_CALL","awsApiCallAction":{"api":"DeleteTrail","serviceName":"cloudtrail.amazonaws.com","callerType":"Remote IP","remoteIpDetails":{"ipAddressV4":"157.48.175.108","organization":{"asn":"55836","asnOrg":"Reliance Jio Infocomm Limited","isp":"Jio","org":"Jio"},"country":{"countryName":"India"},"city":{"cityName":"Hyderabad"},"geoLocation":{"lat":17.3841,"lon":78.4564}},"affectedResources":{"AWS::CloudTrail::Trail":"arn:aws:cloudtrail:ap-south-1:572067442387:trail/Athena"}}},"resourceRole":"TARGET","additionalInfo":{},"evidence":null,"eventFirstSeen":"2020-09-11T08:52:12Z","eventLastSeen":"2020-09-11T08:52:12Z","archived":false,"count":1},"severity":2,"createdAt":"2020-09-11T09:09:36.314Z","updatedAt":"2020-09-11T09:09:36.314Z","title":"AWS CloudTrail trail arn:aws:cloudtrail:ap-south-1:572067442387:trail/Athena was disabled.","description":"AWS CloudTrail trail arn:aws:cloudtrail:ap-south-1:572067442387:trail/Athena was disabled by arun.reddy calling DeleteTrail under unusual circumstances. This can be attackers attempt to cover their tracks by eliminating any trace of activity performed while they accessed your account."}

shaunak · September 16, 2020, 9:46am

Thanks. I noticed in your filebeat.yml file that you also have a type: log input. If you send your log sample via this input, does it get parsed as expected?

Shaunak

Ameer_Mukadam · September 16, 2020, 10:25am

Hi, I got it to work. It was a silly mistake on my part; I am very new to Elastic. I had not added target: "" to the decode_json_fields processor. As soon as I did that, it parsed the JSON data.
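
For anyone landing here later, the fix described in this reply would look roughly like the snippet below. It is a sketch based on the processors section posted earlier in the thread, not the poster's exact final file. According to the Filebeat documentation for decode_json_fields, when target is omitted the decoded object replaces the source field in place, so the keys end up nested under message. An empty-string target merges the decoded keys into the root of the event, which is where the rename entries (accountId, region, and so on) look for them.

```yaml
processors:
  - decode_json_fields:
      # Field holding the raw JSON string read from the S3 object.
      fields: ["message"]
      # An empty string merges the decoded keys into the event root
      # instead of replacing the "message" field in place.
      target: ""
```

The rename and add_fields processors from the original config would follow this entry unchanged.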

system · October 14, 2020, 12:25pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
