We have a busy production set-up generating many thousands of log lines, which are forwarded by Beats to Kafka before hitting Logstash and Elasticsearch.
We were paged last week at the end of the business day (when load normally subsides) due to high CPU. On investigation, we found that two of our servers followed the same pattern.
Initially, at about the start of the business day, @timestamp (ingestion time) and timestamp (event generation time) started to diverge. This is unusual, and it's the only time we've seen it. Then at about 8pm the CPU spiked, the event rate by @timestamp in ELK spiked, and disk usage on the two servers dropped. After this the two timestamps converged, so all good, and we've not seen it since.
My question relates to Filebeat. Given that Kafka is in the mix, is there a reason Filebeat appears to have slowed down and then played catch-up?
This sounds like backpressure, but after talking to the solution architects, they think that Kafka should have buffered the data and Filebeat should have continued working steadily.
Is Filebeat's backpressure mechanism really just TCP congestion control, so that a busy Kafka broker could have the same impact as a slow downstream consumer?
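For context, our Filebeat config is roughly shaped like the sketch below (hostnames, topic name, paths, and the specific values are illustrative placeholders, not our actual settings). My understanding is that the in-memory queue is what mediates backpressure: once it fills because the Kafka output can't drain it fast enough, harvesters stop reading, which would explain the timestamps diverging and then converging during catch-up:

```yaml
# Illustrative filebeat.yml fragment -- values are placeholders,
# not our production configuration.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log   # hypothetical path

# Filebeat's internal in-memory queue. If the output can't publish
# fast enough, this fills up and harvesters pause reading the files,
# i.e. backpressure propagates back to the source.
queue.mem:
  events: 4096
  flush.min_events: 2048
  flush.timeout: 1s

output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092"]
  topic: "app-logs"
  required_acks: 1        # wait for the partition leader's ack
  compression: gzip
  max_message_bytes: 1000000
```

If the brokers themselves were busy (slow acks, leader re-elections, etc.), I'd expect the queue to fill and Filebeat to throttle in exactly the way we observed, regardless of whether the pressure originated in Kafka or further downstream.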