Strange Filebeat alerts processing

Hi, dear community!

We've run into the following problem in our Elastic Stack: Filebeat ships the processed logs, about 100-150 GB per day on average, but instead of the steady line we have always had, we now see peaks like these (see screenshot). We've checked both the Elasticsearch and Filebeat logs in debug mode but found nothing. We've also checked the network, and the traffic is nice and smooth. Could you advise, please?
We use Elasticsearch/Filebeat 7.8.1. Thanks in advance.

The chart seems to be cut off in the screenshot so it's not possible to tell what those lines (series) mean or what their scales are (no y-axis). Could you please post the full chart?

Also, were you seeing the straight line behavior before and then it changed? If so, did you change something in your environment around the time the behavior changed?

Hi, @shaunak! Thank you for the reply!
I hope this chart looks better than the first one. At the pointer, the normal line breaks into peaks where Filebeat re-sends logs. We didn't make any changes to the Elastic cluster or our servers; it happened completely unexpectedly.
We've checked the servers that send the logs, and we've checked both the Elasticsearch and Filebeat logs in debug mode and found nothing suspicious. We've tried restarting the Elastic cluster. We've checked the network traffic and it looks fine. We have no more ideas about what else to do. So we're here :slight_smile:

This is interesting. How are you able to tell that Filebeat is trying to re-send logs? This would happen if Filebeat was having trouble talking to Elasticsearch, but then you'd see errors about it in the Filebeat log.

It looks that way, I suppose. If Filebeat couldn't send the logs in the usual way, it then sends the whole backlog at once, and we see these peaks. Is that right?

Right, that was my first theory as well (given the extremes: 0 for some time and then a peak, 0 and then a peak, etc.). But this theory can be easily tested by looking at the Filebeat logs around the times of 0 activity: we should see errors about being unable to send data to Elasticsearch, retrying, etc. You mentioned in your earlier comment that you found nothing suspicious in the logs, though. So I'm not sure this is due to Filebeat retrying. :thinking:

We didn't pay much attention to whether data was arriving in Elastic at that moment. Let me explain: we turned on debug mode on several of the servers that send the logs and left it on for a couple of hours. Then we filtered these logs for keywords such as error, warning, disconnect and so on, and found nothing. We also saw in the Filebeat debug log that it was sending the processed logs. But in Elastic we still see either nothing or these peaks.
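For reference, the keyword filtering described above can be sketched roughly as follows. This is a minimal illustration, not the exact procedure we ran: the keyword list, the function name `find_suspicious`, and the sample log lines are assumptions made up for the example.

```python
import re

# Keywords to look for. "warn" also matches "WARN" and "warning".
# The exact list is an assumption for illustration.
KEYWORDS = ("error", "warn", "disconnect")


def find_suspicious(lines, keywords=KEYWORDS):
    """Return the log lines that contain any keyword, case-insensitively."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


if __name__ == "__main__":
    # Hypothetical lines, loosely shaped like a Filebeat log; not real output.
    sample = [
        "2020-08-10T12:00:01.000Z INFO harvester started",
        "2020-08-10T12:00:02.000Z ERROR failed to publish events",
        "2020-08-10T12:00:03.000Z DEBUG events acked",
    ]
    for hit in find_suspicious(sample):
        print(hit)
```

To scan a real file, the same function can be fed an open file handle, for example `find_suspicious(open(path))`, where `path` points at the Filebeat log in question.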

Any chance you could shut down all but one of your Filebeat instances, then observe this chart for a bit? If it continues this zero-then-spike pattern, could you post the Filebeat logs (debug level would be nice but not required) here (appropriately redacted) from when the activity is 0?

Shaunak

Unfortunately, no, this is a critical system.

Understood. In that case, is there any way you could filter that chart to only show traffic coming from one Filebeat instance? I'm thinking of different ways we could narrow down the scope of this problem so we might get some clues and/or make it easier to troubleshoot.
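As a rough sketch of what I mean by narrowing to one instance: the snippet below builds an Elasticsearch search body with a `date_histogram` aggregation restricted to one host, and lists the time buckets that contain no events. The field `agent.hostname`, the aggregation name `events_over_time`, the helper names, and the default time range are assumptions for illustration; adjust them to whatever your Filebeat indices actually contain.

```python
import json


def per_host_histogram(hostname, interval="1m", start="now-6h", end="now"):
    """Build a search body counting events per time bucket for one host."""
    return {
        "size": 0,  # only the aggregation is needed, not the documents
        "query": {
            "bool": {
                "filter": [
                    # Assumed field name; check your index mapping.
                    {"term": {"agent.hostname": hostname}},
                    {"range": {"@timestamp": {"gte": start, "lte": end}}},
                ]
            }
        },
        "aggs": {
            "events_over_time": {
                "date_histogram": {
                    "field": "@timestamp",
                    "fixed_interval": interval,
                    "min_doc_count": 0,  # keep empty buckets so gaps are visible
                }
            }
        },
    }


def empty_buckets(response):
    """Return the timestamps of buckets with zero documents."""
    buckets = response["aggregations"]["events_over_time"]["buckets"]
    return [b["key_as_string"] for b in buckets if b["doc_count"] == 0]


if __name__ == "__main__":
    # "web-01" is a hypothetical hostname.
    print(json.dumps(per_host_histogram("web-01"), indent=2))
```

The printed body could be sent to the `_search` endpoint of the Filebeat indices, and the parsed JSON response passed to `empty_buckets` to get the timestamps worth checking in that instance's log.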

Isn't the one I posted earlier suitable? If not, I will make another one.

Oh, I didn't realize that was for traffic from a single Filebeat instance. In that case, you don't need to post a new one. If you could post some logs here (maybe a minute's worth) from that Filebeat instance (with sensitive data redacted) during the timestamps when it's apparently not sending any logs over the network, that would be great. Thanks!

Okay, I will try to find something. Thanks :slight_smile:

This is the Filebeat log file from the timestamps when we were not receiving any logs in Elastic:
https://file.io/xy2a5mAS3z36

That link 404s for me.

Hm, that's strange. Sorry, here it is once again: https://dropmefiles.com/TLgVW
pwd: o1G62F

The problem is solved. The issue was on the AWS side. Thanks!