I have a log file that needs to be sent to Logstash using Filebeat. My log file is ~500 MB in size. Whenever a new event is added to the log file, Filebeat sends the whole log file to Logstash. I am interested in sending only the new events to Logstash.
That does not sound right, as Filebeat's default settings already work the way you describe. It only sends the new parts of the file, assuming the file is not being completely rewritten every time a new event is logged.
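Filebeat keys its state on the file's inode and device and resumes from a stored byte offset, so a quick diagnostic is to compare the inode before and after a new event is written. This sketch uses a temporary file so it is self-contained; to check your case, run the two `stat` calls against your actual log path instead:

```shell
#!/bin/sh
# Record the inode of a log file before and after a write. If the inode
# changes, the logger is replacing the file instead of appending to it,
# and Filebeat will treat it as a brand-new file and ship it from offset 0.
LOG=$(mktemp)
echo "existing event" >> "$LOG"

BEFORE=$(stat -c '%i' "$LOG")
echo "new event" >> "$LOG"      # an append keeps the same inode
AFTER=$(stat -c '%i' "$LOG")

if [ "$BEFORE" = "$AFTER" ]; then
  echo "appended in place: Filebeat ships only the new bytes"
else
  echo "file replaced: Filebeat will re-ship the whole file"
fi
rm -f "$LOG"
```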
filebeat:
  prospectors:
    -
      paths:
        - /var/www/zap-daemon/log/zap-daemon.log
        - /var/www/zap-daemon/log/test.log
      input_type: log
      tail_files: true
  registry_file: /var/lib/filebeat/registry

output:
  ### Logstash as output
  logstash:
    # The Logstash hosts
    hosts: ["172.31.59.92:5044"]
    bulk_max_size: 2048
    index: gui
    tls:
      # List of root certificates for HTTPS server verifications
      certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]

logging:
  files:
    rotateeverybytes: 10485760 # = 10MB
Another thing happening here: initially, when I push these logs to Logstash, Kibana displays the number of hits as 117,968. But after adding a single line to the log file, Kibana shows the number of hits as 117,968 + 117,969. So when searching for a particular log entry, entries are duplicated and I get the wrong count for a given time range.
I manually added one line to the log file to see how it shows up in Kibana.
I do not see any log file for the Filebeat service itself. Could you please tell me exactly which output you need to better understand my problem?
Here is the output from the command filebeat -e -d publish:
2017/03/20 12:50:45.620875 geolite.go:24: INFO GeoIP disabled: No paths were set under output.geoip.paths
2017/03/20 12:50:45.621578 logstash.go:106: INFO Max Retries set to: 3
2017/03/20 12:50:45.704197 outputs.go:126: INFO Activated logstash as output plugin.
2017/03/20 12:50:45.704254 publish.go:232: DBG Create output worker
2017/03/20 12:50:45.704299 publish.go:274: DBG No output is defined to store the topology. The server fields might not be filled.
2017/03/20 12:50:45.704390 publish.go:288: INFO Publisher name: ip-172-31-63-75
2017/03/20 12:50:45.704533 async.go:78: INFO Flush Interval set to: 1s
2017/03/20 12:50:45.704566 async.go:84: INFO Max Bulk Size set to: 2048
2017/03/20 12:50:45.704589 async.go:92: DBG create bulk processing worker (interval=1s, bulk size=2048)
2017/03/20 12:50:45.704634 beat.go:168: INFO Init Beat: filebeat; Version: 1.3.1
2017/03/20 12:50:45.705223 beat.go:194: INFO filebeat sucessfully setup. Start running.
2017/03/20 12:50:45.705274 registrar.go:68: INFO Registry file set to: /var/lib/filebeat/registry
2017/03/20 12:50:45.705343 prospector.go:133: INFO Set ignore_older duration to 0s
2017/03/20 12:50:45.705371 prospector.go:133: INFO Set close_older duration to 1h0m0s
2017/03/20 12:50:45.705393 prospector.go:133: INFO Set scan_frequency duration to 10s
2017/03/20 12:50:45.705415 prospector.go:93: INFO Input type set to: log
2017/03/20 12:50:45.705438 prospector.go:133: INFO Set backoff duration to 1s
2017/03/20 12:50:45.705467 prospector.go:133: INFO Set max_backoff duration to 10s
2017/03/20 12:50:45.705495 prospector.go:113: INFO force_close_file is disabled
2017/03/20 12:50:45.705520 prospector.go:143: INFO Starting prospector of type: log
2017/03/20 12:50:45.705612 log.go:115: INFO Harvester started for file: /var/www/zap-daemon/log/test.log
2017/03/20 12:50:45.705727 spooler.go:77: INFO Starting spooler: spool_size: 2048; idle_timeout: 5s
2017/03/20 12:50:45.705776 log.go:115: INFO Harvester started for file: /var/www/zap-daemon/log/zap-daemon.log
2017/03/20 12:50:45.705874 crawler.go:78: INFO All prospectors initialised with 2 states to persist
2017/03/20 12:50:45.705915 registrar.go:87: INFO Starting Registrar
2017/03/20 12:50:45.705948 publish.go:88: INFO Start sending events to output
I made the changes to the second file, i.e., /var/www/zap-daemon/log/zap-daemon.log.
This is the registry file after I manually added a log entry to the log file.
What is the command you used to add a line to the log file?
The log output you posted above is what I'm also interested in, but it stops right before we would see any metrics. Please wait at least 30s until you see a metrics entry. It's best to paste the output into a gist and link it here.
As mentioned in the topic above, when we use echo to append a line to the log file, the problem is solved. When the vi editor is used, the whole file is shipped to Elasticsearch again.
Use echo to add lines to the log file. That solves the issue.
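The difference between the two can be sketched as follows. An `echo >>` appends in place, so the file keeps its inode and Filebeat's registry offset stays valid. Editors like vi may save by writing a copy and replacing the original (depending on their backup settings), which gives the file a new inode, so Filebeat sees a brand-new file and re-ships all of it. This demo uses a temporary file to show both behaviors:

```shell
#!/bin/sh
# Compare appending in place vs. rewriting a file, and watch the inode.
# Filebeat keys its registry state on inode+device, so a new inode means
# the whole file is read again from the beginning.
LOG=$(mktemp)
echo "first event" >> "$LOG"
INODE_BEFORE=$(stat -c '%i' "$LOG")

# Append in place -- this is what 'echo >>' does:
echo "second event" >> "$LOG"
INODE_AFTER_APPEND=$(stat -c '%i' "$LOG")

# Rewrite the file -- roughly what vi can do when saving, depending on
# its backup settings: write a copy, then replace the original.
cp "$LOG" "$LOG.tmp"
echo "third event" >> "$LOG.tmp"
mv "$LOG.tmp" "$LOG"
INODE_AFTER_REWRITE=$(stat -c '%i' "$LOG")

echo "append:  $INODE_BEFORE -> $INODE_AFTER_APPEND (unchanged)"
echo "rewrite: $INODE_AFTER_APPEND -> $INODE_AFTER_REWRITE (new inode)"
rm -f "$LOG"
```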