How to send only the newly added log events instead of the entire content of a log file?

Hi,

I have a log file that needs to be sent to Logstash using Filebeat. The log file is ~500 MB in size. Whenever a new event is added to the log file, Filebeat sends the whole log file to Logstash. I am interested in sending only the new events to Logstash.

That does not sound right; Filebeat's default settings already work the way you describe. It only sends the new parts of the file, assuming the file is not being completely rewritten every time a new event is logged.
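
Under the hood, Filebeat remembers how far it has read into each file (per inode) in its registry file and only ships the bytes after that stored offset. Roughly the idea of this sketch (not Filebeat's actual code; the state file path is made up):

LOG=/var/www/zap-daemon/log/zap-daemon.log   # path from this thread
STATE=/tmp/zap-daemon.offset                 # made-up scratch file holding the last offset

offset=$(cat "$STATE" 2>/dev/null || echo 0)
size=$(wc -c < "$LOG")

if [ "$size" -gt "$offset" ]; then
  # ship only the bytes appended since the last run
  tail -c +"$((offset + 1))" "$LOG"
  echo "$size" > "$STATE"
fi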

Can you provide your config?

This is my configuration file.

filebeat:
  prospectors:
    -
      paths:
        - /var/www/zap-daemon/log/zap-daemon.log
        - /var/www/zap-daemon/log/test.log
      input_type: log
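      # tail_files: start reading new files at the end instead of from the beginning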
      tail_files: true
  
  registry_file: /var/lib/filebeat/registry

output:
  ### Logstash as output
  logstash:
    # The Logstash hosts
    hosts: ["172.31.59.92:5044"]
    bulk_max_size: 2048
    index: gui
    tls:
      # List of root certificates for HTTPS server verifications
      certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]
logging:
  files:
    rotateeverybytes: 10485760 # = 10MB

Another thing happening here: initially, when I push these logs to Logstash, Kibana displays the number of hits as 117,968. But after adding a single line to the log file, Kibana shows the number of hits as 117,968 + 117,969, i.e., the whole file appears to have been indexed a second time. So when searching for a particular log entry, things are duplicated and I get the wrong count for a given time range.

Remove tail_files: true and try again.

Hi @andrewkroh,

I tried that as well, removing tail_files: true, but I am still getting the same result.

  • How is your log file written? Could it be that your logging tool creates a new file?
  • Could you post the log output from filebeat here?

Hi @ruflin,

I manually added one line to the log file to see how it shows up in Kibana.

I do not see any log file for the Filebeat service itself. Could you please tell me exactly which output you need to better understand my problem?

Here is the output from the command filebeat -e -d publish

2017/03/20 12:50:45.620875 geolite.go:24: INFO GeoIP disabled: No paths were set under output.geoip.paths
2017/03/20 12:50:45.621578 logstash.go:106: INFO Max Retries set to: 3
2017/03/20 12:50:45.704197 outputs.go:126: INFO Activated logstash as output plugin.
2017/03/20 12:50:45.704254 publish.go:232: DBG Create output worker
2017/03/20 12:50:45.704299 publish.go:274: DBG No output is defined to store the topology. The server fields might not be filled.
2017/03/20 12:50:45.704390 publish.go:288: INFO Publisher name: ip-172-31-63-75
2017/03/20 12:50:45.704533 async.go:78: INFO Flush Interval set to: 1s
2017/03/20 12:50:45.704566 async.go:84: INFO Max Bulk Size set to: 2048
2017/03/20 12:50:45.704589 async.go:92: DBG create bulk processing worker (interval=1s, bulk size=2048)
2017/03/20 12:50:45.704634 beat.go:168: INFO Init Beat: filebeat; Version: 1.3.1
2017/03/20 12:50:45.705223 beat.go:194: INFO filebeat sucessfully setup. Start running.
2017/03/20 12:50:45.705274 registrar.go:68: INFO Registry file set to: /var/lib/filebeat/registry
2017/03/20 12:50:45.705343 prospector.go:133: INFO Set ignore_older duration to 0s
2017/03/20 12:50:45.705371 prospector.go:133: INFO Set close_older duration to 1h0m0s
2017/03/20 12:50:45.705393 prospector.go:133: INFO Set scan_frequency duration to 10s
2017/03/20 12:50:45.705415 prospector.go:93: INFO Input type set to: log
2017/03/20 12:50:45.705438 prospector.go:133: INFO Set backoff duration to 1s
2017/03/20 12:50:45.705467 prospector.go:133: INFO Set max_backoff duration to 10s
2017/03/20 12:50:45.705495 prospector.go:113: INFO force_close_file is disabled
2017/03/20 12:50:45.705520 prospector.go:143: INFO Starting prospector of type: log
2017/03/20 12:50:45.705612 log.go:115: INFO Harvester started for file: /var/www/zap-daemon/log/test.log
2017/03/20 12:50:45.705727 spooler.go:77: INFO Starting spooler: spool_size: 2048; idle_timeout: 5s
2017/03/20 12:50:45.705776 log.go:115: INFO Harvester started for file: /var/www/zap-daemon/log/zap-daemon.log
2017/03/20 12:50:45.705874 crawler.go:78: INFO All prospectors initialised with 2 states to persist
2017/03/20 12:50:45.705915 registrar.go:87: INFO Starting Registrar
2017/03/20 12:50:45.705948 publish.go:88: INFO Start sending events to output

Is this file being updated? Check the ownership of the file and compare it with the user running Filebeat.
registry_file: /var/lib/filebeat/registry
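
For example (assuming Filebeat is running on this host and these are the paths from your config):

ls -l /var/lib/filebeat/registry     # who owns the registry file?
ls -l /var/www/zap-daemon/log/       # who owns the log files?
ps -o user= -C filebeat              # which user is Filebeat running as?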

This is the initial state of the registry file.

{"/var/www/zap-daemon/log/test.log":{"source":"/var/www/zap-daemon/log/test.log","offset":1036,"FileStateOS":{"inode":530079,"device":51713}},"/var/www/zap-daemon/log/zap-daemon.log":{"source":"/var/www/zap-daemon/log/zap-daemon.log","offset":13854305,"FileStateOS":{"inode":530077,"device":51713}}}

I made changes to the second file, i.e., /var/www/zap-daemon/log/zap-daemon.log.
This is the registry file after I manually added a log entry to the log file.

{"/var/www/zap-daemon/log/test.log":{"source":"/var/www/zap-daemon/log/test.log","offset":1036,"FileStateOS":{"inode":530079,"device":51713}},"/var/www/zap-daemon/log/zap-daemon.log":{"source":"/var/www/zap-daemon/log/zap-daemon.log","offset":13854322,"FileStateOS":{"inode":530073,"device":51713}}}

This file is owned by the "root" user.

I am still getting the whole log file's contents in Kibana, and when I search for a particular field, things are duplicated.

What is the command you used to add a line to the log file?

The log output you posted above is what I'm interested in, but it stops right before we would see any metrics. Please wait at least 30s until you see a metrics entry. It's best to paste it into a gist and then link it here.

I am using the vi editor to add a line to the logs.

Here is the output from the above command.

Here is the output of the filebeat log when nothing is added to the log file.

https://gist.github.com/anonymous/fe9afc51533564c81c9f31e0915ddb6c

I followed the instructions in this topic.

As mentioned in the topic above, when echo is used to add a line to the log file, the problem is solved. When the vi editor is used, the whole file is shipped to Elasticsearch.

Use echo to add lines to the log file. That solves the issue.
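
For example (the log line itself is just a placeholder), appending with echo keeps the same inode, so Filebeat's stored offset stays valid:

echo "manual test entry" >> /var/www/zap-daemon/log/zap-daemon.log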

I can confirm this behavior. I think it is because editing the file with vim gives it a new inode. Here is the output of my registry:

[{"source":"/home/martin/Dokumente/test.txt","offset":24,"FileStateOS":{"inode":278184,"device":2049},"timestamp":"2017-03-21T13:24:03.635574722+01:00","ttl":-1},{"source":"/home/martin/Dokumente/test.txt","offset":30,"FileStateOS":{"inode":278217,"device":2049},"timestamp":"2017-03-21T13:24:28.728930547+01:00","ttl":-1}]

As you can see, the inode is different. First I changed the file four times with echo, and then with vim.

I'm not a Linux expert, but here is an article about it: http://unix.stackexchange.com/questions/36467/why-inode-value-changes-when-we-edit-in-vi-editor/37177
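
You can see it for yourself with ls -i (using the file from my test above):

ls -i /home/martin/Dokumente/test.txt   # note the inode number
vim /home/martin/Dokumente/test.txt     # make a change and :wq
ls -i /home/martin/Dokumente/test.txt   # the inode has changed, so Filebeat treats it as a new file

If you really need to hand-edit a file that Filebeat is watching, vim's :set backupcopy=yes should make it overwrite the file in place and keep the inode, but for a quick test echo >> is simpler.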

Thanks for the confirmation and link.

:see_no_evil: I see that this is answered in the Logstash topic.

@maddin2016 Very interesting to know. I wasn't aware vim had this option.
