Shipping GitLab job logs

Hi,

I have started using Filebeat to ship my GitLab logs. I have a directory structure that changes every few minutes; each directory contains a log file with data that I want to ship to Elasticsearch and build a dashboard from.

I've configured my filebeat.yml file to scan this type of directory:

- type: log
  enabled: true
  paths:
    - /artifacts/*/*/*/*/*/*/*.log
  fields:
    log: jobs-log

It starts scanning the filesystem, but once the initial scan finishes it does not pick up newly created files.

Any idea how I can ship all the GitLab logs to the ELK stack?

Thanks

Hi @meir, welcome to the Elastic community forums!

A few questions:

  1. What version of Filebeat are you using?

  2. Have you tried defining your path as a recursive_glob, so something like:

    paths:
      - /artifacts/**/*.log
    

Thanks,

Shaunak

Hi @shaunak and thanks for replying.

  • I'm using the latest version of Filebeat:

    filebeat version 7.5.1 (amd64), libbeat 7.5.1 [60dd883ca29e1fdd5b8b075bd5f3698948b1d44d built 2019-12-16 21:56:14 +0000 UTC]
    
  • I'm already using this configuration in the filebeat.yml file:

    - type: log
      enabled: true
      scan_frequency: 0.5s
      paths:
         - /var/opt/gitlab/gitlab-ci/builds/**/*.log
      fields:
         log: gitlab-jobs-log
    

I have managed to collect those logs, but now I'm facing a new issue: Filebeat is not fast enough to collect logs from jobs that finish quickly.

For example: I execute a pipeline, the XXX.log files are created and finished within 3-5 seconds, and then they are moved to the "artifact directory". This causes Filebeat to read only X lines from the log file instead of the full log content.

Any idea how I can grab all the log output before it's moved to the "artifact directory"?
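
For reference, a minimal sketch of log input options that are sometimes tuned for short-lived files (close_removed, close_renamed, and backoff are standard log input settings; the values below, and whether they actually cover this case, are assumptions):

    - type: log
      enabled: true
      scan_frequency: 0.5s   # pick up newly created job logs quickly
      backoff: 0.5s          # re-check an open file more often than the 1s default
      close_removed: false   # keep reading through the open handle if the file is deleted
      close_renamed: false   # keep reading through the open handle if the file is moved (default)
      paths:
        - /var/opt/gitlab/gitlab-ci/builds/**/*.log
      fields:
        log: gitlab-jobs-log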

Thanks,
Meir.

Forgive me if I'm misunderstanding your setup, but why not just have Filebeat collect the logs from the artifact directory? It sounds like all logs eventually get moved there and would stay there long enough for Filebeat to collect them.

Shaunak

I did that at the beginning, and these are the issues I encountered:

  • The directory structure is huge, so it takes a long time to scan it and push the structure and log data to the ELK stack.
  • Open-file limits: I had to add "LimitNOFILE=100000" to filebeat.service to stop Filebeat complaining about too many open files.
  • Most of the log files in the "artifact directory" are old and no longer relevant to us.
  • Filebeat ingests the old logs with the current timestamp, which is wrong.

Those are the reasons why I don't scan the "artifact directory".
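
For illustration, a hedged sketch of settings that are sometimes used to cope with old files and open-file limits when scanning a large tree (ignore_older, clean_inactive, close_inactive, and harvester_limit are standard log input options; the values here are assumptions, not something tested against this setup):

    - type: log
      enabled: true
      paths:
        - /artifacts/**/*.log
      ignore_older: 24h      # skip files not modified in the last 24 hours
      clean_inactive: 48h    # drop registry state for inactive files (must be greater than ignore_older + scan_frequency)
      close_inactive: 2m     # release the file handle once a file stops being written to
      harvester_limit: 512   # cap parallel harvesters to keep open file handles bounded
      fields:
        log: jobs-log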

Thanks,
Meir

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.