We use Filebeat to gather a wide variety of logs, and we observe strange behaviour: Kibana shows all the logs gathered in a single instant, every couple of days.
For instance, we have 130,000 events in the Nginx HTTP logs, but these events all have the same timestamp, and they're mostly duplicates; there are only 5 or 6 distinct events.
We can read the logs manually from /var/log/nginx, and there's nothing unusual; Nginx works fine.
The same happens, e.g., for the syslog events: they're all gathered together. In this case there are no duplicates, but the timestamp is still the same.
The same behaviour occurs for /var/log/secure.
All the events from all the different harvesters have the same timestamp, e.g. 13th of May 16:24:01.xxx; only the milliseconds value (xxx) changes.
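To confirm that the events really do cluster on a single @timestamp, Elasticsearch can be queried for the most frequent timestamp values. A minimal sketch, assuming the default filebeat-* index pattern and Elasticsearch reachable on port 9200 of the Logstash/Kibana host (both are assumptions about the setup, not something shown above):

# Count events per @timestamp value; a handful of huge buckets
# would confirm that everything is indexed in a single instant.
# (filebeat-* and port 9200 are assumptions, not taken from the config.)
curl -s 'http://logs.example.com:9200/filebeat-*/_search?pretty' -d '{
  "size": 0,
  "aggs": {
    "by_timestamp": {
      "terms": { "field": "@timestamp", "size": 5 }
    }
  }
}'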
This is the file /etc/filebeat/filebeat.yml:
filebeat:
  prospectors:
    -
      paths:
        - /var/log/nginx/*access.log
        - /var/log/nginx/*error.log
      input_type: log
      document_type: foobar-http
      force_close_files: false
      fields_under_root: true
      close_older: 1h
      fields:
        env: preprod
        structure: http
    # (other prospectors here...)

output:
  logstash:
    hosts: ['logs.example.com:5044']
    worker: 1
    loadbalance: true
    index: filebeat
    ssl:
      certificate_authorities: ["/etc/pki/tls/certs/logstash-forwarder.crt"]
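Filebeat's own logging can also be made more verbose, to correlate harvester activity with the publish errors shown below. A minimal sketch of an extra logging section, assuming Filebeat 1.x options (the path and file name are arbitrary choices):

# Verbose Filebeat logging; level, to_files and the files block are
# standard Filebeat 1.x options, the path and name are our own choices.
logging:
  level: debug
  to_files: true
  files:
    path: /var/log/filebeat
    name: filebeat.log
    keepfiles: 7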
Here are the Filebeat logs:
2018-05-14T11:11:18+03:00 ERR Connecting error publishing events (retrying): read tcp 10.1.1.1:58360->10.2.2.2:5044: i/o timeout
2018-05-14T11:11:46+03:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_errors=1
2018-05-14T11:12:16+03:00 INFO No non-zero metrics in the last 30s
2018-05-14T11:12:46+03:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.write_bytes=157
2018-05-14T11:12:48+03:00 ERR Connecting error publishing events (retrying): read tcp 10.1.1.1:58650->10.2.2.2:5044: i/o timeout
2018-05-14T11:13:16+03:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_errors=1
2018-05-14T11:13:46+03:00 INFO No non-zero metrics in the last 30s
2018-05-14T11:14:16+03:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.write_bytes=157
2018-05-14T11:14:19+03:00 ERR Connecting error publishing events (retrying): read tcp 10.1.1.1:58876->10.2.2.2:5044: i/o timeout
2018-05-14T11:14:46+03:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.read_errors=1
2018-05-14T11:15:16+03:00 INFO No non-zero metrics in the last 30s
2018-05-14T11:15:46+03:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.publish.write_bytes=157
where 10.1.1.1 is the machine running the Nginx server and 10.2.2.2 is our Logstash/Kibana server.
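The repeating pattern (157 bytes written, then a read timeout shortly afterwards) suggests the TCP connection opens but the exchange with Logstash never completes. To rule out a TLS problem, the connection can be tested by hand from the Nginx machine, reusing the CA file from the config above:

# TLS handshake test against the Beats input on the Logstash server;
# "Verify return code: 0 (ok)" would mean TLS itself is fine and the
# timeouts happen at the Beats protocol level instead.
openssl s_client -connect logs.example.com:5044 \
    -CAfile /etc/pki/tls/certs/logstash-forwarder.crt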
What could be the reason for this problem? Thanks in advance.