Old file with new name found


(Luke Whitworth) #1

I'm seeing an issue where filebeat is detecting "Old file with new name found" an awful lot for files that are most certainly not renames. The full log of one instance is:

2016-02-15T11:03:00Z INFO Old file with new name found: /data/logstash/resnet_argus.2016.02.11.08.01.01.0.txt is no /data/logstash/resnet_argus.archive/2016/02/15/resnet_argus.2016.02.15.10.01.01.0.txt
2016-02-15T11:03:00Z INFO Detected rename of a previously harvested file: /data/logstash/resnet_argus.2016.02.11.08.01.01.0.txt -> /data/logstash/resnet_argus.archive/2016/02/15/resnet_argus.2016.02.15.10.01.01.0.txt
2016-02-15T11:03:00Z INFO Harvester started for file: /data/logstash/resnet_argus.archive/2016/02/15/resnet_argus.2016.02.15.10.01.01.0.txt

These files are netflow logs, so they are retained for a few days before being deleted. I've seen this issue on filebeat v1.11 and the latest nightly build. Has anyone got any pointers on what I can possibly do to stop this from happening?

Cheers,

Luke


(Steffen Siering) #2

Renames are normally detected by inodes being reused. After how many days are the files deleted? Are they deleted outright, or 'renamed' by log rotation? Is data missing due to the false rename detection?


(Luke Whitworth) #3

The files are deleted by a script that runs at 3am every day, which looks for files older than 24 hours and deletes them. Yes, I'm seeing data missing as a result, sadly.
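For context on what is being discussed here: filebeat-style rename detection boils down to comparing the (device, inode) pair of two files, which is what Go's os.SameFile does on Unix. A minimal Python sketch of that check (illustrative only, not filebeat's actual code) shows why a freshly created file that happens to receive a deleted file's inode looks like a rename:

```python
import os

def same_file(stat_a, stat_b):
    """Filebeat-style identity check: two paths are 'the same file'
    when they live on the same device and share an inode number."""
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)

# Two names for the same file (a hard link) compare equal ...
with open("a.log", "w") as f:
    f.write("entry1\n")
os.link("a.log", "b.log")
print(same_file(os.stat("a.log"), os.stat("b.log")))  # True

# ... while an independent file does not.
with open("c.log", "w") as f:
    f.write("entry2\n")
print(same_file(os.stat("a.log"), os.stat("c.log")))  # False

# If a.log were deleted and the filesystem handed its freed inode to a
# brand-new file, same_file() would return True for that new file too;
# that is the false "rename" seen in this thread.

for p in ("a.log", "b.log", "c.log"):  # cleanup
    os.remove(p)
```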


(ruflin) #4

I assume this is the same conversation as here: https://github.com/elastic/filebeat/issues/271#issuecomment-185600358

Let's continue it here in this thread for the moment.


(ruflin) #5
  • How often are new files created?
  • As far as I understand, you are not doing log rotation as every new file has a unique name?
  • Are they on a shared drive?
  • Can you share your filebeat config?

(Luke Whitworth) #6
  • New files are created hourly
  • Yep each is effectively a new file
  • Nope the files are on local disk
  • I can, but I'm now not back in work until next week. I'll post it up on Monday

Cheers for the assistance!


(ruflin) #7

Are these the only files created on this disk, or are other files written by the OS during this time?

Looking forward to the config.


(Luke Whitworth) #8

Apologies for the delay, here's the config:

filebeat:
  prospectors:
    -
      paths:
        - "/data/logstash/janet_argus.archive/*/*/*/*.txt"
      document_type: argus
      fields:
        argus_network: janet
    -
      paths:
        - "/data/logstash/resnet_argus.archive/*/*/*/*.txt"
      document_type: argus
      fields:
        argus_network: resnet

  registry_file: /var/lib/filebeat/registry

output:
  logstash:
    hosts: ["logstashhost:5044"]
    tls:
      certificate_authorities:
        - /etc/pki/tls/certs/logstash.crt

shipper:

logging:
  to_syslog: false
  to_files: true
  files:
    path: /var/log/filebeat
    name: filebeat
    rotateeverybytes: 10485760 # = 10MB
    keepfiles: 7
  level: info

There are other files created on the disk, yes. These argus txt files are conversions of raw files that are created at the same time, so three files are created hourly, the txt file being one of them.


(ruflin) #9

@lwhitworth Sorry for the late answer. Could you check whether the files that are detected as "renamed" have the same inode information? You can also find the inode info in the registry file.

We just fixed a bug in the registrar where some information was being overwritten. Not sure if that could have an effect in your case, but it's worth a try: https://github.com/elastic/beats/pull/1061


(Jason Stangroome) #10

Hi,

We are hitting this issue in production with filebeat 1.1.2.

I have published a gist (link at the end) with a full repro configuration, script, and console output showing that on rotation, the newly created log file receives the same inode number as the deleted oldest file and filebeat considers this a rename and does not process the new log file.

A possible improvement to filebeat which would likely reduce the production issues significantly (but won't address the exact issue demonstrated in my gist) would be to also consider file sizes. That is, in the prospector's getPreviousFile function, if os.SameFile returns true, subject the file to a further test: if the new file's size is smaller than the last offset read, then the file is different. Would this be too broad a generalisation?
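The proposed heuristic can be sketched as follows. This is a hypothetical illustration, not filebeat's actual code; the function name and parameters are assumptions for the sketch:

```python
import os

def is_probably_same_file(old_stat, last_read_offset, new_path):
    """Proposed refinement of the SameFile-style check: even when the
    (device, inode) pair matches, a file smaller than the offset we had
    already read cannot be a plain rename of the old file."""
    new_stat = os.stat(new_path)
    same_identity = (old_stat.st_dev, old_stat.st_ino) == \
                    (new_stat.st_dev, new_stat.st_ino)
    if not same_identity:
        return False
    # Inode matches, but the file is shorter than what we already read:
    # treat it as a different (new) file rather than a rename.
    return new_stat.st_size >= last_read_offset

# Example: a "new" file reusing an inode but holding less data than was
# already shipped from the old one is now treated as a fresh file.
with open("rotated.log", "w") as f:
    f.write("ent2\n")                      # 5 bytes
old_stat = os.stat("rotated.log")          # stands in for the old file's stat
print(is_probably_same_file(old_stat, last_read_offset=100,
                            new_path="rotated.log"))  # False: size 5 < offset 100
os.remove("rotated.log")
```

As noted above, this would not catch the case where the new file has already grown past the old offset by the time the check runs.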

Regards,

Jason


(ruflin) #11

Thanks for the detailed report.

There is already a check for whether a file was truncated. If the size of a file is smaller than the offset, reading should start from 0: https://github.com/elastic/beats/blob/master/filebeat/harvester/reader.go#L106

If the inode of the new file is the same as the old one, it seems you hit this issue here: https://github.com/elastic/filebeat/issues/271 There are also potential solutions discussed (which are not implemented yet).

In your case, what could help is to first create the new file and then delete the old one. I assume the inode is picked up because these two events happen one right after the other.
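The suggested ordering can be sketched like this (a hypothetical rotation helper, not part of filebeat; the paths are illustrative):

```python
import os

def rotate(new_path, old_path):
    """Create the new log file *before* deleting the old one. The old
    file's inode is still in use when the new file is allocated, so the
    filesystem cannot hand the same inode to the new file."""
    # 1. Allocate the new file first (its inode is chosen now).
    with open(new_path, "w"):
        pass
    # 2. Only then remove the old file, freeing its inode.
    if os.path.exists(old_path):
        os.remove(old_path)

# Illustrative usage:
with open("input.1", "w") as f:
    f.write("old entries\n")
rotate("input", "input.1")
print(os.path.exists("input"), os.path.exists("input.1"))  # True False
os.remove("input")
```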

I will dig deeper here and run your script to see what I can come up with.


(Jason Stangroome) #12

Hi Nicolas,

Thanks for responding.

If you modify filebeat-issue-repro.sh in my gist and change line 29 from echo entry2 >>./input to something shorter, e.g. echo ent2 >>./input, so that the new file is smaller than the last read offset, the issue remains. So I believe the code you linked to in harvester/reader.go is not being reached.

Our rotation is done with logrotate, which always deletes the oldest file first, but I think I can fudge its config with a post-rotate script to delete the oldest file after creating the new file.
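Such a fudge might look like the following (a hypothetical logrotate config sketch, not a tested recipe; the path, rotation count, and schedule are assumptions):

```conf
# /etc/logrotate.d/app (sketch)
/var/log/app/input {
    hourly
    rotate 8              # keep one extra generation ...
    create
    postrotate
        # ... and delete the oldest file here, after logrotate has
        # already created the new input file, so the freed inode
        # cannot be handed to the new file.
        rm -f /var/log/app/input.8
    endscript
}
```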

Regards,

Jason


(Jason Stangroome) #13

Furthermore, in my repro, after changing the rotate logic to ensure that inode numbers are not reused, filebeat works as expected.

However, a new issue is revealed when the filebeat process is restarted.

When filebeat is stopped, its registry file only contains information about the current input file and has not persisted any data about the input.1 file, whose contents had already been shipped prior to rotation.

When filebeat starts again, the contents of input.1 are re-shipped, because the file has not aged sufficiently to be ignored and the registry did not contain any information about it, its inode, or its last read offset.

I can supply an updated version of my repro script which reliably demonstrates this new problem if that would be helpful.

Regards,

Jason


(ruflin) #14

The file rotation bug with the registry is exactly what should have been fixed in 1.1.2 :frowning: https://github.com/elastic/beats/issues/1010


(Bill) #15

I have the same issue as the poster and have confirmed that it does not always detect a truncated file and start at the beginning. Data is not sent when this occurs. When the file is detected as truncated and starts at zero I am OK and still get the data.

In my filebeat log file (mybeat):
2016-03-28T14:10:07Z INFO Old file with new name found: /mydata/hello.20160327.0940.data is no /mydata/hello.20160328.1400.data
2016-03-28T14:10:07Z INFO Detected rename of a previously harvested file: /mydata/hello.20160327.0940.data -> /mydata/hello.20160328.1400.data
2016-03-28T14:10:07Z INFO Harvester started for file: /mydata/hello.20160328.1400.data

Here are the corresponding inodes from the registry file:

"/mydata/hello.20160324.1210.data": {
  "source": "/mydata/hello.20160327.0940.data", "offset": 34145599,
  "FileStateOS": {"inode": 920, "device": 64792}
}
"/mydata/hello.20160327.0940.data": {
  "source": "/mydata/hello.20160328.1400.data", "offset": 31301200,
  "FileStateOS": {"inode": 920, "device": 64792}
}

Now, a little further down in my mybeat file I see another instance but this one starts again from position zero:
2016-03-28T14:30:01Z INFO Old file with new name found: /mydata/hello.20160327.1410.data is no /mydata/hello.20160328.1420.data
2016-03-28T14:30:01Z INFO Detected rename of a previously harvested file: /mydata/hello.20160327.1410.data -> /mydata/hello.20160328.1420.data
2016-03-28T14:30:01Z INFO Harvester started for file: /mydata/hello.20160328.1420.data
2016-03-28T14:30:01Z INFO File was truncated. Begin reading file from offset 0: /mydata/hello.20160328.1420.data

Here are the corresponding registry entries for that file:
"/mydata/hello.20160327.1410.data": { "source":"/mydata/hello.20160328.1420.data","offset":41483772,"FileStateOS": { "inode":5381,"device":64792} } "/mydata/hello.20160328.1420.data": { "source":"/mydata/hello.20160328.1420.data","offset":32127895,"FileStateOS": { "inode":5381,"device":64792} }

Is there a per-harvester configuration directive for the registry tracking? I know that for this particular dataset each file is always new and rotates off into a compressed state. I don't necessarily care about the tracked state for these files, because only a few of them exist at any given time. Hope that makes sense.


(Bill) #16

Just wanted to provide an update: I set the ignore_older directive for that prospector in my filebeat.yml file to only a couple of hours.

Hopefully that will fix my issue, since the inodes will no longer be tracked for 24 hours.
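For reference, that change might look like this in filebeat.yml (a sketch; the path glob and the two-hour value are illustrative assumptions):

```yaml
filebeat:
  prospectors:
    -
      paths:
        - "/mydata/hello.*.data"
      # Stop tracking (and harvesting) files whose modification time is
      # older than two hours, so stale inode state ages out of the
      # registry sooner.
      ignore_older: 2h
```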


(Bill) #17

Editing that setting did not work (I read that ignore_older should work as expected in the next release).

Just a few minutes ago, Kibana is showing a 10 minute gap in my data.
mybeat file:
2016-03-29T01:40:07Z INFO Old file with new name found: /mydata/hello.20160328.0030.data is no /mydata/hello.20160329.0130.data
2016-03-29T01:40:07Z INFO Detected rename of a previously harvested file: /mydata/hello.20160328.0030.data -> /mydata/hello.20160329.0130.data
2016-03-29T01:40:07Z INFO Harvester started for file: /mydata/hello.20160329.0130.data

It does not indicate a restart at offset 0.

Here are the corresponding registry entries:

"/mydata/hello.20160328.0030.data": {
  "source": "/mydata/hello.20160329.0130.data", "offset": 39033572,
  "FileStateOS": {"inode": 9522405, "device": 64792}
}
"/mydata/hello.20160329.0130.data": {
  "source": "/mydata/hello.20160329.0130.data", "offset": 63450665,
  "FileStateOS": {"inode": 9522405, "device": 64792}
}


(Jason Stangroome) #18

Hi,

I was also hoping it had been fixed, but apparently there is still an edge case. Should I re-open issue 1010, or would you like to handle this another way?

You may also like to review a similar issue I've reproduced:

Regards,

Jason


(Bill) #19

The issue appears to persist in the new version. I have ignore_older and close_older set to 1 hour, and I am still running into inode reuse, which causes filebeat to think a rename is occurring. The issue appears to stem from filebeat thinking that I am updating a previously used inode. The same types of log entries I posted before are present in the logs, and I still have data gaps in Kibana when this occurs. Here is an example of what is in my log:

2016-04-01T20:20:04Z INFO Registry file updated. 9399 states written.
2016-04-01T20:20:04Z INFO Old file with new name found: /mydata/logs-20160228.2300.data is no /mydata/logs-20160401.2010.data
2016-04-01T20:20:04Z INFO Detected rename of a previously harvested file: /mydata/logs-20160228.2300.data -> /mydata/logs-20160401.2010.data
2016-04-01T20:20:04Z INFO Harvester started for file: /mydata/logs-20160401.2010.data
2016-04-01T20:20:04Z INFO Registry file updated. 9400 states written.


(Sumantha Kannantha) #20

We are also facing this issue in the latest version (v1.1.2) of filebeat.
We use log4j2 for logging. In our case, we always log into Applog.log and later rename it to a unique file name by appending a date and time stamp (Applog_Date_Time.log). Our application rolls over frequently. It looks like filebeat is getting confused by the rollover.

.../logs/
  Applog.log
  Applog_Date_Time1.log
  Applog_Date_Time2.log
  ...
  Applog_Date_TimeN.log

Later we decided to create a unique timestamped file name from the start, instead of logging into Applog.log, to avoid this issue in filebeat. However, the log4j appender still renames the file when it reaches the specified max size, and it looks like this rename is creating the problem.