Filebeat issue on Windows

Hi,
I'm having an issue with Filebeat on Windows XP: the same file is loaded several times, which causes duplicated rows. Any advice?
Thanks,
Paris

filebeat.yml

filebeat.prospectors:
- input_type: log
  paths:
    - C:\nms2k\ems\measure\PERF*.csv
  include_lines: ['RTT', 'RTJ']

logging.level: debug

#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["172.20.10.35:1000"]

Registry.log

[
{
"source":"C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv",
"offset":8432131,
"FileStateOS":{
"idxhi":9371648,
"idxlo":45457,
"vol":940466456
},
"timestamp":"2017-10-23T17:01:37.3302254+02:00",
"ttl":-1
},
{
"source":"C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv",
"offset":8530531,
"FileStateOS":{
"idxhi":7667712,
"idxlo":45566,
"vol":940466456
},
"timestamp":"2017-10-23T17:01:37.3302254+02:00",
"ttl":-1
},
{
"source":"C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv",
"offset":8697811,
"FileStateOS":{
"idxhi":17891328,
"idxlo":45393,
"vol":940466456
},
"timestamp":"2017-10-23T17:01:37.3302254+02:00",
"ttl":-1
},
{
"source":"C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv",
"offset":8860171,
"FileStateOS":{
"idxhi":18677760,
"idxlo":45393,
"vol":940466456
},
"timestamp":"2017-10-23T17:01:37.3302254+02:00",
"ttl":-1
},
{
"source":"C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv",
"offset":9022531,
"FileStateOS":{
"idxhi":10485760,
"idxlo":45566,
"vol":940466456
},
"timestamp":"2017-10-23T17:01:37.3302254+02:00",
"ttl":-1
},
{
"source":"C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv",
"offset":9189811,
"FileStateOS":{
"idxhi":11272192,
"idxlo":45566,
"vol":940466456
},
"timestamp":"2017-10-23T17:01:37.3302254+02:00",
"ttl":-1
},
{
"source":"C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv",
"offset":9352171,
"FileStateOS":{
"idxhi":21889024,
"idxlo":45393,
"vol":940466456
},
"timestamp":"2017-10-23T17:01:37.3302254+02:00",
"ttl":-1
},
{
"source":"C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv",
"offset":10056551,
"FileStateOS":{
"idxhi":18743296,
"idxlo":47494,
"vol":940466456
},
"timestamp":"2017-10-23T17:01:59.5954229+02:00",
"ttl":-1
},
{
"source":"C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv",
"offset":10061471,
"FileStateOS":{
"idxhi":62783488,
"idxlo":45441,
"vol":940466456
},
"timestamp":"2017-10-23T17:02:36.6415866+02:00",
"ttl":-1
},
{
"source":"C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv",
"offset":6731762,
"FileStateOS":{
"idxhi":1769472,
"idxlo":47498,
"vol":940466456
},
"timestamp":"2017-10-23T17:02:49.1413466+02:00",
"ttl":-1
}
]

filebeat.log

2017-10-23T17:01:36+02:00 INFO Loading Prospectors: 1
2017-10-23T17:01:36+02:00 INFO Prospector with previous states loaded: 9
2017-10-23T17:01:36+02:00 INFO Starting prospector of type: log; id: 7490406548110299537
2017-10-23T17:01:36+02:00 INFO Loading and starting Prospectors completed. Enabled prospectors: 1
2017-10-23T17:01:36+02:00 INFO Starting Registrar
2017-10-23T17:01:36+02:00 INFO Start sending events to output
2017-10-23T17:01:36+02:00 INFO Starting spooler: spool_size: 2048; idle_timeout: 5s
2017-10-23T17:01:36+02:00 INFO Harvester started for file: C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv
2017-10-23T17:02:05+02:00 INFO Non-zero metrics in the last 30s: filebeat.harvester.open_files=1 filebeat.harvester.running=1 filebeat.harvester.started=1 libbeat.logstash.call_count.PublishEvents=36 libbeat.logstash.publish.read_bytes=300 libbeat.logstash.publish.write_bytes=1152048 libbeat.logstash.published_and_acked_events=49056 libbeat.publisher.published_events=49056 publish.events=73595 registrar.states.current=11 registrar.states.update=73595 registrar.writes=36
2017-10-23T17:02:16+02:00 INFO Harvester started for file: C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv
2017-10-23T17:02:35+02:00 INFO Non-zero metrics in the last 30s: filebeat.harvester.open_files=1 filebeat.harvester.running=1 filebeat.harvester.started=1 libbeat.logstash.call_count.PublishEvents=35 libbeat.logstash.publish.read_bytes=210 libbeat.logstash.publish.write_bytes=1116602 libbeat.logstash.published_and_acked_events=47786 libbeat.publisher.published_events=47786 publish.events=71680 registrar.states.current=1 registrar.states.update=71680 registrar.writes=35
2017-10-23T17:02:36+02:00 INFO Harvester started for file: C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv
2017-10-23T17:02:49+02:00 INFO Stopping filebeat
2017-10-23T17:02:49+02:00 INFO Stopping Crawler
2017-10-23T17:02:49+02:00 INFO Stopping 1 prospectors
2017-10-23T17:02:49+02:00 INFO Prospector ticker stopped
2017-10-23T17:02:49+02:00 INFO Stopping Prospector: 7490406548110299537
2017-10-23T17:02:49+02:00 INFO Prospector outlet closed
2017-10-23T17:02:49+02:00 INFO Prospector channel stopped because beat is stopping.
2017-10-23T17:02:49+02:00 INFO Reader was closed: C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv. Closing.
2017-10-23T17:02:49+02:00 INFO Reader was closed: C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv. Closing.
2017-10-23T17:02:49+02:00 INFO Crawler stopped
2017-10-23T17:02:49+02:00 INFO Stopping spooler
2017-10-23T17:02:49+02:00 INFO Stopping Registrar
2017-10-23T17:02:49+02:00 INFO Ending Registrar
2017-10-23T17:02:49+02:00 INFO Total non-zero values: filebeat.harvester.closed=3 filebeat.harvester.started=3 libbeat.logstash.call_count.PublishEvents=97 libbeat.logstash.publish.read_bytes=660 libbeat.logstash.publish.write_bytes=3099239 libbeat.logstash.published_and_acked_events=130974 libbeat.publisher.published_events=132339 publish.events=196475 registrar.states.current=13 registrar.states.update=196475 registrar.writes=97
2017-10-23T17:02:49+02:00 INFO Uptime: 1m13.4829641s
2017-10-23T17:02:49+02:00 INFO filebeat stopped.

Filebeat retries infinitely on IO errors. This can lead to duplicates if Logstash did receive an event but an IO error occurred, so that Filebeat had to assume something went wrong and send the event again. In your sample log I can't find any errors, and only the published_and_acked_events metric shows up, indicating publishing was fine and no duplicates/retries have been sent to Logstash.
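As background, this at-least-once behaviour is normally handled by deduplicating downstream, for example by deriving a deterministic ID from each event so a retried copy overwrites the first one instead of being indexed twice (with Logstash, a fingerprint filter feeding the Elasticsearch document_id is the usual route). Here is a minimal sketch of the idea in Go; the eventID helper is hypothetical, not Filebeat or Logstash code:

package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// eventID derives a deterministic ID from an event's source path and raw
// line, so the same event always maps to the same ID no matter how many
// times it is resent. If identical lines can legitimately repeat within a
// file, mix in the byte offset as well.
func eventID(source, line string) string {
	h := sha1.New()
	h.Write([]byte(source))
	h.Write([]byte{0}) // separator, so ("ab","c") != ("a","bc")
	h.Write([]byte(line))
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	src := `C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv`
	// A retried copy of the same line produces the same ID, so an
	// idempotent sink (e.g. indexing with this as the document ID)
	// stores it only once.
	fmt.Println(eventID(src, "RTT;42;2017-10-23T17:01:00"))
	fmt.Println(eventID(src, "RTT;42;2017-10-23T17:01:00")) // identical ID
}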

How are the files written? This might be another issue here. It looks like all file paths are the same, but with different file metadata. On Linux the inode is used to identify a file, not the file path. On Windows the triple (volume, idxhi, idxlo) uniquely identifies a file. The identity of your file seems to change all the time. This can happen either because you are using network shares (if Windows can't determine a unique identity, it generates random numbers) or because the application is not appending to the file but writing a completely new file each time (your file does not behave like a log file).
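You can check this yourself by printing the identity triple for the file; it is the same information that ends up as FileStateOS in the registry and comes from the Win32 call GetFileInformationByHandle. Below is a minimal, Windows-only sketch in Go, with your path hard-coded for illustration. Run it a few times while the application is writing and see whether vol/idxhi/idxlo change while the path stays the same:

package main

import (
	"fmt"
	"syscall"
)

func main() {
	path := `C:\nms2k\ems\measure\PERF~EMS_ERICEMS~ERIC~172.21.8.1~171023.csv`
	p, err := syscall.UTF16PtrFromString(path)
	if err != nil {
		panic(err)
	}
	// Open with full sharing so we don't block the writer.
	h, err := syscall.CreateFile(p, syscall.GENERIC_READ,
		syscall.FILE_SHARE_READ|syscall.FILE_SHARE_WRITE|syscall.FILE_SHARE_DELETE,
		nil, syscall.OPEN_EXISTING, syscall.FILE_ATTRIBUTE_NORMAL, 0)
	if err != nil {
		panic(err)
	}
	defer syscall.CloseHandle(h)

	// BY_HANDLE_FILE_INFORMATION carries the (vol, idxhi, idxlo) triple
	// that shows up as FileStateOS in the registry file.
	var info syscall.ByHandleFileInformation
	if err := syscall.GetFileInformationByHandle(h, &info); err != nil {
		panic(err)
	}
	fmt.Printf("vol=%d idxhi=%d idxlo=%d\n",
		info.VolumeSerialNumber, info.FileIndexHigh, info.FileIndexLow)
}

If those numbers differ between runs while the path is unchanged, the OS itself is reporting a new file identity each time, which matches what your registry dump shows.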


So the problem is how the file is written; that was my concern. Basically I have to find a workaround.

The Windows machine connects to several network devices and writes the log every minute (appending to the log, in theory), but Filebeat thinks it is a different file every time.

or because the application is not appending to the file but writing a completely new file each time (your file does not behave like a log file)

That's not possible, because if I delete the file it starts growing again from 0 bytes and I lose the previous hour.

Thanks for the answer,
Paris

So it's not a Filebeat issue, indeed.

Thanks for the answers
