Tens of minutes ingestion delay

I've got data coming from Metricbeat that is showing up in Elasticsearch tens of minutes after the event time, rather than within the second or so I'd expect (I'm generating an "ingested" timestamp in Elasticsearch and comparing it with the @timestamp).
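For reference, the "ingested" timestamp comes from an ingest pipeline along these lines (the pipeline and field names are just what I happen to use, so treat this as a sketch), with Metricbeat's Elasticsearch output pointed at it via its pipeline setting:

PUT _ingest/pipeline/add-ingested-timestamp
{
  "description": "Record when Elasticsearch received the document",
  "processors": [
    { "set": { "field": "ingested", "value": "{{_ingest.timestamp}}" } }
  ]
}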

This isn't a busy system, and it's very similar to another system I've got which doesn't show this behaviour.

How to diagnose?

Is the system time on the various hosts correctly set? Is Metricbeat feeding data directly to Elasticsearch?
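It's also worth checking whether the cluster itself is stable while the delay is happening, since nodes dropping out or the master changing can hold up indexing. The standard cluster APIs will show the current state, e.g.:

GET _cluster/health?pretty
GET _cat/nodes?v
GET _cat/master?v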

Metricbeat is sending directly to Elasticsearch. The Metricbeat hosts are set to UTC and the Elasticsearch ones to BST, but all have their clocks set correctly. When Metricbeat is restarted, documents come through quickly for a few minutes, but then it slows down again, with the delay varying between around 20 and 40 minutes.
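For what it's worth, I'm eyeballing the delay with a simple query along these lines (the metricbeat-* index pattern and the ingested field name are just from my setup):

GET metricbeat-*/_search
{
  "size": 5,
  "sort": [ { "ingested": { "order": "desc" } } ],
  "_source": [ "@timestamp", "ingested" ]
}

The gap between the two fields on the most recent documents is what's sitting at 20 to 40 minutes.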

There's stuff like the following in the Elasticsearch log; I've no idea what it means or whether it matters:

[2018-04-24T15:14:42,407][INFO ][o.e.d.z.ZenDiscovery     ] [live-monitor-1] master_left [{live-monitor-2}{49PcsRhMSUitVmH6PHbTCA}{_bdlSndNSrSL7g2AFcXbhg}{172.31.12.89}{172.31.12.89:9300}], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2018-04-24T15:14:42,407][WARN ][o.e.d.z.ZenDiscovery     ] [live-monitor-1] master left (reason = failed to ping, tried [3] times, each with maximum [30s] timeout), current nodes: nodes:
   {live-monitor-2}{49PcsRhMSUitVmH6PHbTCA}{_bdlSndNSrSL7g2AFcXbhg}{172.31.12.89}{172.31.12.89:9300}, master
   {live-monitor-1}{xE9eAhNXQ0uBQAaPlqfuFQ}{6D5AWlLJQuWHcJHhj727Iw}{172.31.11.96}{172.31.11.96:9300}, local
   {live-monitor-3}{W0B2alNyQOqXerA94r-1PA}{O3IHPNZ2SoO-X6LRMPYDjw}{172.31.13.99}{172.31.13.99:9300}

[2018-04-24T15:14:45,452][INFO ][o.e.c.s.ClusterService   ] [live-monitor-1] detected_master {live-monitor-2}{49PcsRhMSUitVmH6PHbTCA}{_bdlSndNSrSL7g2AFcXbhg}{172.31.12.89}{172.31.12.89:9300}, reason: zen-disco-receive(from master [master {live-monitor-2}{49PcsRhMSUitVmH6PHbTCA}{_bdlSndNSrSL7g2AFcXbhg}{172.31.12.89}{172.31.12.89:9300} committed version [21741]])
[2018-04-24T15:15:27,225][INFO ][o.e.c.s.ClusterService   ] [live-monitor-1] removed {{live-monitor-3}{W0B2alNyQOqXerA94r-1PA}{O3IHPNZ2SoO-X6LRMPYDjw}{172.31.13.99}{172.31.13.99:9300},}, reason: zen-disco-receive(from master [master {live-monitor-2}{49PcsRhMSUitVmH6PHbTCA}{_bdlSndNSrSL7g2AFcXbhg}{172.31.12.89}{172.31.12.89:9300} committed version [21776]])
[2018-04-24T15:15:30,646][INFO ][o.e.c.s.ClusterService   ] [live-monitor-1] added {{live-monitor-3}{W0B2alNyQOqXerA94r-1PA}{O3IHPNZ2SoO-X6LRMPYDjw}{172.31.13.99}{172.31.13.99:9300},}, reason: zen-disco-receive(from master [master {live-monitor-2}{49

... and after rebuilding the VMs and reinstalling everything (which I was intending to do anyway for other reasons), it's behaving OK now.
