Filebeat Cisco Umbrella Fileset

Hi. I'm trying to set up the Filebeat Cisco module with the Umbrella fileset. I understand that it does not yet support Cisco-managed S3 instances, but I see that you can set the input to file. I can't find anything about how to actually set this up, though. I have a script that syncs the .gz files locally to my server, so I assume I then need a var.paths set to the location I am downloading the files to, but nothing seems to happen. I even unzipped one of the files and moved it to /var/lib/umbrella/, and still nothing. Here is my cisco.yml from modules.d:

# Module: cisco
# Docs: https://www.elastic.co/guide/en/beats/filebeat/7.10/filebeat-module-cisco.html

- module: cisco
  asa:
    enabled: false

    # Set which input to use between syslog (default) or file.
    #var.input: syslog

    # The interface to listen to UDP based syslog traffic. Defaults to
    # localhost. Set to 0.0.0.0 to bind to all available interfaces.
    #var.syslog_host: localhost

    # The UDP port to listen for syslog traffic. Defaults to 9001.
    #var.syslog_port: 9001

    # Set the log level from 1 (alerts only) to 7 (include all messages).
    # Messages with a log level higher than the specified will be dropped.
    # See https://www.cisco.com/c/en/us/td/docs/security/asa/syslog/b_syslog/syslogs-sev-level.html
    #var.log_level: 7

  ftd:
    enabled: false

    # Set which input to use between syslog (default) or file.
    #var.input: syslog

    # The interface to listen to UDP based syslog traffic. Defaults to
    # localhost. Set to 0.0.0.0 to bind to all available interfaces.
    #var.syslog_host: localhost

    # The UDP port to listen for syslog traffic. Defaults to 9003.
    #var.syslog_port: 9003

    # Set the log level from 1 (alerts only) to 7 (include all messages).
    # Messages with a log level higher than the specified will be dropped.
    # See https://www.cisco.com/c/en/us/td/docs/security/firepower/Syslogs/b_fptd_syslog_guide/syslogs-sev-level.html
    #var.log_level: 7

  ios:
    enabled: false

    # Set which input to use between syslog (default) or file.
    #var.input: syslog

    # The interface to listen to UDP based syslog traffic. Defaults to
    # localhost. Set to 0.0.0.0 to bind to all available interfaces.
    #var.syslog_host: localhost

    # The UDP port to listen for syslog traffic. Defaults to 9002.
    #var.syslog_port: 9002

    # Set custom paths for the log files when using file input. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

  nexus:
    enabled: false

    # Set which input to use between udp (default), tcp or file.
    # var.input: udp
    # var.syslog_host: localhost
    # var.syslog_port: 9506

    # Set paths for the log files when file input is used.
    # var.paths:

    # Toggle output of non-ECS fields (default true).
    # var.rsa_fields: true

    # Set custom timezone offset.
    # "local" (default) for system timezone.
    # "+02:00" for GMT+02:00
    # var.tz_offset: local

  meraki:
    enabled: false

    # Set which input to use between udp (default), tcp or file.
    # var.input: udp
    # var.syslog_host: localhost
    # var.syslog_port: 9525

    # Set paths for the log files when file input is used.
    # var.paths:

    # Toggle output of non-ECS fields (default true).
    # var.rsa_fields: true

    # Set custom timezone offset.
    # "local" (default) for system timezone.
    # "+02:00" for GMT+02:00
    # var.tz_offset: local

  umbrella:
    enabled: true

    var.input: file
    var.paths: ["/var/lib/umbrella/*"]
    # AWS SQS queue url
    #var.queue_url: https://sqs.us-east-1.amazonaws.com/ID/CiscoQueue
    # Access ID to authenticate with the S3 input
    #var.access_key_id: 123456
    # Access key to authenticate with the S3 input
    #var.secret_access_key: PASSWORD
    # The duration that the received messages are hidden from ReceiveMessage request
    #var.visibility_timeout: 300s
    # Maximum duration before AWS API request will be interrupted
    #var.api_timeout: 120s

Your config seems good. Could you please share your debug logs?

Hello!
I have the same problem:
2021-02-02T18:38:27.193Z DEBUG [input] input/input.go:139 Run input
2021-02-02T18:38:27.193Z DEBUG [input] log/input.go:205 Start next scan
2021-02-02T18:38:27.193Z DEBUG [input] log/input.go:302 Skipping directory: /home/ubuntu/umbrella/dnslogs
2021-02-02T18:38:27.193Z DEBUG [input] log/input.go:226 input states cleaned up. Before: 0, After: 0, Pending: 0

In the config I have:

umbrella:
  enabled: true
  var.input: file
  var.paths: ["/home/ubuntu/umbrella/*"]

Thank you!

It is fixed when I unzip the .csv.gz files.
Any chance that I do not need to do that?

Hello @YegorKovylyayev and @InnerJoin .

The Cisco Umbrella module, at least in its initial version, was made with S3 in mind. Support for Cisco-managed S3 buckets is something we are looking into and want to improve if possible.

While the file input should in theory work, you might run into some issues there, since the initial focus was on S3. The file input does not support archived files (tar, zip, gz, etc.), and the Cisco Umbrella files are compressed by default.

If you uncompress them first, reading them should work (our local tests use local log files), but it does require you to follow the directory structure of Cisco Umbrella, at least for the last folder.

You can store the logs wherever you want, but the last folder name is important. Let's say you store them under /var/logs/cisco/. Matching the Umbrella directory structure, your folder names would then be (the file names can be anything you want):

/var/logs/cisco/dnslogs/something.csv
/var/logs/cisco/proxylogs/something.csv
/var/logs/cisco/iplogs/something.csv
/var/logs/cisco/cloudfirewalllogs/something.csv

Your file input would then be

umbrella:
  enabled: true
  var.input: file
  var.paths: ["/var/logs/cisco/*/*.csv"]
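Since the file input cannot read the compressed files directly, the unzip step discussed above can be sketched roughly as follows. This is a hypothetical helper, not part of the module; the source and destination paths, and the `decompress_umbrella` name, are placeholders:

```shell
#!/bin/sh
# Hypothetical helper: decompress synced Umbrella .csv.gz files into the
# directory layout Filebeat's file input expects. Paths are placeholders.
decompress_umbrella() {
    src=$1  # where your sync script drops the .csv.gz files
    dst=$2  # what var.paths points at, e.g. /var/logs/cisco
    for dir in dnslogs proxylogs iplogs cloudfirewalllogs; do
        mkdir -p "$dst/$dir"
        for f in "$src/$dir"/*.csv.gz; do
            [ -e "$f" ] || continue
            out="$dst/$dir/$(basename "$f" .gz)"
            # only decompress files we have not seen before
            [ -e "$out" ] || gunzip -c "$f" > "$out"
        done
    done
}
```

Running something like this from cron after each sync keeps the uncompressed copies in the required layout, and the `*/*.csv` glob in var.paths then picks them up.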

Thank you @Marius_Iversen, it is working fine.

Sorry, I kept forgetting to come back in here and reply, but I just ended up pulling the logs down with the AWS CLI tool and then parsing them with Logstash, since it will read the compressed files.
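For anyone taking the same route, a minimal sketch of such a Logstash pipeline might look like the following. The path is a placeholder and the CSV column names are an assumption; verify them against the Umbrella log format documentation for your schema version:

```conf
# Hypothetical Logstash pipeline, not from this thread.
input {
  file {
    path => "/var/lib/umbrella/dnslogs/*.csv.gz"
    # "read" mode treats files as complete and decompresses gzip automatically
    mode => "read"
  }
}
filter {
  csv {
    # Assumed DNS log columns; check against your Umbrella log schema
    columns => ["timestamp", "policy_identity", "identities", "internal_ip",
                "external_ip", "action", "query_type", "response_code",
                "domain", "categories"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```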