Filebeat resends all events of a logfile every day

scheffler · July 24, 2020, 8:06am

Hi

The logfile always has the same name, but a housekeeping job copies and zips the file every night, with the side effect that the file gets a new inode. I guess that the inode is the harvester's primary key in the registry, so Filebeat assumes a new file and does not care that the name is the same.
Setting tail_files: true in filebeat.yml doesn't help, because to Filebeat it is a new file.

You can reproduce this effect by editing a logfile with vi. If you save the file with :wq, it gets a new inode (run stat before and after saving), and Elastic then shows all entries of the file again. If you instead append with echo 'errormessage' >> filename, only the last error is sent.
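
The difference between the two cases can be sketched in a few lines of Python. This is only an illustration, assuming the editor (or the housekeeping job) saves by writing a new file and renaming it over the old one, which is what the inode change described above suggests. The helper names and the dummy.log file are made up for the example.

```python
import os
import tempfile


def inode_of(path):
    """Return the inode number that stat reports for path."""
    return os.stat(path).st_ino


def append_line(path, line):
    """Append in place, like: echo 'errormessage' >> filename."""
    with open(path, "ab") as f:
        f.write(line.encode() + b"\n")


def rewrite_and_rename(path, line):
    """Assumed save strategy: write a new file, then rename it over the old one."""
    with open(path, "rb") as f:
        content = f.read()
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "wb") as f:
        f.write(content + line.encode() + b"\n")
    os.replace(tmp, path)  # same name afterwards, but a different file


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "dummy.log")
        with open(path, "wb") as f:
            f.write(b"first line\n")
        before = inode_of(path)
        append_line(path, "ORA-00001: appended")
        after_append = inode_of(path)
        rewrite_and_rename(path, "ORA-00002: rewritten")
        after_rewrite = inode_of(path)
        return before, after_append, after_rewrite


if __name__ == "__main__":
    before, after_append, after_rewrite = demo()
    print("append kept the inode:   ", before == after_append)
    print("rewrite changed the inode:", after_append != after_rewrite)
```

Appending keeps the inode, so a reader that tracks files by inode continues from its stored offset. The rename produces a file with a different inode under the same name, which looks like a brand-new file.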

Does anybody have an idea how to solve the problem?

Cheers,
Heinz

mtojek · July 27, 2020, 8:38am

Take a look at file_identity property: https://www.elastic.co/guide/en/beats/filebeat/master/filebeat-input-log.html

scheffler · July 28, 2020, 6:12am

OK, this seems to be the solution (judging from the description). I put a lot of time into it, without success.
In my case I have exact paths/filenames without wildcards, so file_identity.path should be the way. I tried every possible configuration of the file_identity.path property, but it doesn't work. One new entry added with vi and Filebeat sends everything again.
There is no documentation about the value of path. I found just one example:
file_identity.path: ~
I tried
file_identity.path:
file_identity.path: ~
file_identity.path: true
file_identity.path: 'path/filename'

but... nothing works for me. What does ~ mean?

mtojek · July 28, 2020, 8:01am

Can you post the whole config you're using? The property file_identity.path: ~ should be fine.
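
To illustrate why identifying a file by path instead of by inode would avoid the resend, here is a minimal Python sketch of a registry that stores a read offset per file identity. It is not Filebeat's implementation. The mode names "inode" and "path", the helper functions, and the dummy.log file are assumptions made up for the example.

```python
import os
import tempfile


def identity_key(path, mode):
    """Hypothetical helper: the key under which a read offset is stored."""
    if mode == "path":
        return ("path", os.path.abspath(path))
    st = os.stat(path)
    return ("inode", st.st_dev, st.st_ino)


def read_new_data(path, registry, mode):
    """Read everything after the stored offset and remember the new offset."""
    key = identity_key(path, mode)
    offset = registry.get(key, 0)
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    registry[key] = offset + len(data)
    return data.decode()


def replace_file(path, extra):
    """Write a new file with the old content plus extra, then rename it over path."""
    with open(path, "rb") as f:
        content = f.read()
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
    with os.fdopen(fd, "wb") as f:
        f.write(content + extra.encode())
    os.replace(tmp, path)


def simulate(mode):
    """Return what is read before and after the file is replaced."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "dummy.log")
        with open(path, "wb") as f:
            f.write(b"ORA-1\n")
        registry = {}
        first = read_new_data(path, registry, mode)
        replace_file(path, "ORA-2\n")
        second = read_new_data(path, registry, mode)
        return first, second


if __name__ == "__main__":
    print("inode:", simulate("inode"))
    print("path: ", simulate("path"))
```

With the inode-based key, the second read starts at offset 0 and returns the whole file again. With the path-based key, only the newly added line is returned. The trade-off in this sketch is that a path-based key cannot tell a renamed or rotated file from a new one, so it only fits fixed filenames like the ones in this thread.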

scheffler · July 28, 2020, 1:39pm

This is the relevant section:

filebeat.inputs:

- type: log

  enabled: true
  paths:
    - /disk00/app/oracle/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
    - /disk00/app/oracle/diag/rdbms/aitdev01/AITDEV01/trace/alert_AITDEV01.log
    - /disk00/app/oracle/diag/rdbms/aitdev02/AITDEV02/trace/alert_AITDEV02.log
    - /disk00/app/oracle/diag/rdbms/aitma02/AITMA02/trace/alert_AITMA02.log
    - /disk00/app/oracle/diag/rdbms/aitprd01/AITPRD01/trace/alert_AITPRD01.log
    - /disk00/app/oracle/diag/rdbms/aitprd02/AITPRD02/trace/alert_AITPRD02.log
    - /disk00/app/oracle/diag/rdbms/cusprd01/CUSPRD01/trace/alert_CUSPRD01.log
    - /disk00/app/oracle/diag/rdbms/cusprd03/CUSPRD03/trace/alert_CUSPRD03.log
    - /disk00/app/oracle/diag/rdbms/cusprd05/CUSPRD05/trace/alert_CUSPRD05.log
    - /disk00/app/oracle/diag/rdbms/cusqa02/CUSQA02/trace/alert_CUSQA02.log
    - /disk00/app/oracle/diag/rdbms/gfadev/GFADEV/trace/alert_GFADEV.log
    - /disk00/app/oracle/diag/rdbms/gfaprd/GFAPRD/trace/alert_GFAPRD.log
    - /disk00/app/oracle/diag/rdbms/hipadb/HIPADB/trace/alert_HIPADB.log
    - /disk00/app/oracle/diag/rdbms/prodb/PRODB/trace/alert_PRODB.log
    - /home/delphe/dummy.log
  
  #tail_files: true
  file_identity.path: ~

  include_lines: ['ORA-']                               


  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}'      # Regex for the pattern which matches the beginning of the interesting logentry

  multiline.negate: true                               
  multiline.match: after                               


  # Example of an error log entry:
  # 2020-04-19T06:00:00.497793+02:00
  # Errors in file /disk00/app/oracle/diag/rdbms/aitdev01/AITDEV01/trace/AITDEV01_j003_143635.trc:
  # ORA-12012: error on auto execute of job "SYS"."ORA$AT_SQ_SQL_SW_4628"
  # ORA-38153: Software edition is incompatible with SQL plan management.
  # ORA-06512: at "SYS.DBMS_SPM_INTERNAL", line 6202
  # ORA-06512: at "SYS.DBMS_SPM", line 2806
  # ORA-06512: at line 34
  # 2020-04-19T06:00:19.115017+02:00                     # Timestamp i.e. begin of the next error. The last entry will not appear since the logentry is the last one.

  processors:
  - copy_fields:
      fields:
        - from: message
          to: oracle.message
      fail_on_error: false
      ignore_missing: true

  - truncate_fields:
      fields:
      - message
      max_characters: 33 
      fail_on_error: false
      ignore_missing: true

  - timestamp:
      field: message
      layouts:
        - '2020-04-19T06:00:00.497793+02:00'
      target_field: oracle.timestamp


  # Maybe not the best solution, but it reflects my current state of knowledge. Multiple add_fields processor instances write the Oracle instance into a new field called "oracle.instance.name"
  - add_fields:
      when:
         contains:
            log.file.path: "ASM1"
      target: oracle.instance
      fields: 
        name: "ASM1"

  - add_fields:
      when:
         contains:
            log.file.path: "AITDEV01"
      target: oracle.instance
      fields: 
        name: "AITDEV01"

  - add_fields:
      when:
         contains:
            log.file.path: "AITDEV02"
      target: oracle.instance
      fields: 
        name: "AITDEV02"

  - add_fields:
      when:
         contains:
            log.file.path: "AITMA02"
      target: oracle.instance
      fields: 
        name: "AITMA02"

  - add_fields:
      when:
         contains:
            log.file.path: "AITPRD01"
      target: oracle.instance
      fields: 
        name: "AITPRD01"

  - add_fields:
      when:
         contains:
            log.file.path: "AITPRD02"
      target: oracle.instance
      fields: 
        name: "AITPRD02"

  - add_fields:
      when:
         contains:
            log.file.path: "CUSPRD01"
      target: oracle.instance
      fields: 
        name: "CUSPRD01"

  - add_fields:
      when:
         contains:
            log.file.path: "CUSPRD05"
      target: oracle.instance
      fields: 
        name: "CUSPRD05"

  - add_fields:
      when:
         contains:
            log.file.path: "CUSPRD03" 
      target: oracle.instance
      fields: 
        name: "CUSPRD03"

  - add_fields:
      when:
         contains:
            log.file.path: "CUSQA02"
      target: oracle.instance
      fields: 
        name: "CUSQA02"       

  - add_fields:
      when:
         contains:
            log.file.path: "GFAPRD"
      target: oracle.instance
      fields: 
        name: "GFAPRD"       

  - add_fields:
      when:
         contains:
            log.file.path: "PRODB"
      target: oracle.instance
      fields: 
        name: "PRODB"       

  - add_fields:
      when:
         contains:
            log.file.path: "HIPADB"
      target: oracle.instance
      fields: 
        name: "HIPADB"       

  - add_fields:
      when:
         contains:
            log.file.path: "GFADEV"
      target: oracle.instance
      fields: 
        name: "GFADEV"       

  # dummy scans the file /home/delphe/dummy.log for test cases
  - add_fields:
      when:
         contains:
            log.file.path: "dummy"
      target: oracle.instance
      fields: 
        name: "dummy"       

  



#============================= Filebeat modules ===============================

filebeat.config.modules:
....

Thanks,
Heinz

mtojek · July 28, 2020, 3:18pm

One more thing: which version of Filebeat are you using? The file_identity option is really fresh, and I believe it's currently available only on the master and 7.x branches. It has not become part of any release yet.

scheffler · July 29, 2020, 5:40am

Hi Marcin

We use 7.6.2

Cheers,
Heinz

mtojek · July 29, 2020, 6:58am

You can try to build the latest master and verify if that version works for you.

kvch · July 29, 2020, 12:04pm

Have you tried excluding the copied files from harvesting with the exclude_files option?

scheffler · July 30, 2020, 8:57am

Thanks for the hint, but I use paths/filenames without wildcards, and the zipped files are in a completely different directory.

system · August 27, 2020, 10:57am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
