Ignore_older concept is not working

Hi,

The ignore_older setting is not working in our use case.

Please find our use case below:

  1. We took a backup of the Elasticsearch folder installed on our Windows server at 10 AM on May 15, 2024.
  2. Elasticsearch (ES), Filebeat, and Logstash continued to work and push data to ES until 12 PM on May 15, 2024.
  3. At 12 PM, we deleted the entire ES folder.
  4. We restored the backup of ES that was taken at 10 AM on May 15, 2024.
  5. After restoring the ES backup, we could see the data saved up to 10 AM on May 15, 2024.
  6. The Filebeat log file contains monitor and harvester information up to 12 PM.
  7. We need to retrieve the missing data between 10 AM and 12 PM.
  8. We set ignore_older: 2h in the Filebeat configuration (filebeat.yml).
  9. However, after restarting Filebeat and Logstash, Filebeat is not pushing the missing data, and there are no errors in the log file; only monitor entries are present.

Can you please provide your suggestion on this issue?

I have shared the filebeat.yml file for your reference.

# =========================== Filebeat inputs =============================

filebeat.inputs:
- type: log
  enabled: true
  paths:
  - C:\LIMSAudit\AuditTextFilePath\\specimen-*.json
  fields: {log_type: specimen}
  ignore_older: 2h
  
- type: log
  enabled: true
  paths:
  - C:\LIMSAudit\AuditTextFilePath\\useractivity-*.json
  fields: {log_type: useractivity}
  ignore_older: 2h

- type: log
  enabled: true
  paths:
  - C:\LIMSAudit\AuditTextFilePath\\order-*.json
  fields: {log_type: order}
  ignore_older: 2h
  
- type: log
  enabled: true
  paths:
  - C:\LIMSAudit\AuditTextFilePath\\profile-*.json
  fields: {log_type: profile}
  ignore_older: 2h


Hi,

I am unsure why Filebeat is not considering the ignore_older value. We need your guidance to solve this problem.

I appreciate any help you can provide.

Regards,
Babu

While Filebeat is reading log files, it keeps track of its position in each file on disk. It will not reprocess events it has already read while this state is in place. Setting ignore_older does not make Filebeat reprocess files; it just makes it skip new files whose last modification time is older than the ignore_older threshold.

It therefore sounds like you may have misunderstood the purpose of this parameter.
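
To illustrate the behaviour described above, here is a minimal Python sketch of the decision. It is a simplified model and not Filebeat's actual code: the names FileInfo and bytes_to_ship are hypothetical, the stored offset stands in for the position Filebeat tracks per file, and it assumes a file seen for the first time is read from the start.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileInfo:
    size: int          # current size of the log file in bytes
    age_hours: float   # hours since the file was last modified


def bytes_to_ship(file: FileInfo,
                  stored_offset: Optional[int],
                  ignore_older_hours: float) -> int:
    """Simplified model: bytes that would be read from the file on the next scan."""
    # Files last modified before the ignore_older threshold are skipped.
    if file.age_hours > ignore_older_hours:
        return 0
    # No stored position: the file is new to the shipper, so read all of it.
    if stored_offset is None:
        return file.size
    # Otherwise resume from the stored position; already-read bytes are not resent.
    return max(file.size - stored_offset, 0)


# A file that was fully read before the ES restore: nothing is resent.
already_read = bytes_to_ship(FileInfo(size=5000, age_hours=1.0), 5000, 2.0)
# A recent file with no stored position: read from the start.
new_recent = bytes_to_ship(FileInfo(size=1200, age_hours=1.0), None, 2.0)
# A file with no stored position but older than the threshold: skipped.
new_old = bytes_to_ship(FileInfo(size=1200, age_hours=3.0), None, 2.0)
```

In this model, the first case matches the situation in the question: the files were already read up to 12 PM, so the stored position equals the file size and changing ignore_older has no effect on what is sent.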

Thank you for the information.
Can you please advise us on how to restore the missing data between 10 AM and 12 PM?

Is there any configuration available for Filebeat? If Filebeat doesn't have any configuration, could you provide the best approach to handle this scenario?

Regards,
Babu