Logstash 6.4 use beats input plugin to collect large log file, but too slow

PriestTomb · February 4, 2019, 5:29am

Hello,

I use Logstash 6.4 and Filebeat 6.4 to collect large log file(just a test), not the whole file is large, but that each line is large, like 10KB a line.

When a line of logs is only 1KB in size, the test results are normal. But when i use 10KB per line of log files, it is found that Logstash will process 2048 rows of log data each time (of course, this is the default configuration of Filebeat). The output plugin will be processed in a very short time, but every two times, it will be between 20 seconds and 30 seconds. I changed the log level of Logstash to DEBUG, and found that many of logs like this were printed in 20 seconds:

[2019-01-21T08:58:52,131][DEBUG][logstash.pipeline ] "MY_TEST_LOG_CONTENT", "offset"=>82839449, "host"=>{"name"=>"localhost.localdomain"}, "@version"=>"1", "beat"=>{"name"=>"localhost.localdomain", "version"=>"6.4.0", "hostname"=>"localhost.localdomain"}, "tags"=>["beats_input_codec_plain_applied"], "source"=>"/home/logstash/app/testlog_10k/test1.log", "input"=>{"type"=>"log"}}}

After that, there will be a log like Pushing flush onto pipeline. I think that within 20 seconds of each interval, Logstash's beats input plugin is processing the received log data, taking 10KB of logs per line as an example, 2048 rows of data is just 20MB, why the beats plugin handling this too slow?

I used Wireshark to view the network transfer between Filebeat server and Logstash server, and found that 20MB of data was transferred in 1 second to 2 seconds. Similarly, i used the Logstash's file input plugin to test it, using the same standard log file, Logstash processes the data very quickly.

So I think that the beats input plugin will be very slow when dealing with large log files. What can I do to avoid this? Hoping to get your help

Here is the environment I use:

Server: RHEL 6.8 with CPU 8 cores

Version: Filebeat 6.4 and Logstash 6.4

PriestTomb · February 19, 2019, 5:35am

Hello everyone, I debugged the code and found the problem. The V2Batch class of the beats plugin has a problem with the allocate logic.

if (internalBuffer.writableBytes() < size + (2 * SIZE_OF_INT)){
    internalBuffer.capacity(internalBuffer.capacity() + size + (2 * SIZE_OF_INT));
}

This code will cause capacity method to be executed each time a message is received. This operation will take longer and longer.

An issue about this problem is open on github(Too many alloc/memcpy in V2Batch).

system · March 19, 2019, 5:35am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Performance issues with input-beats plugin Logstash	6	5690	July 6, 2017
Filebeat slower than logstash default "file" input? Beats filebeat	5	1342	November 11, 2016
Filebeat sending data to Logstash seems too slow Beats filebeat	20	22407	June 1, 2017
Logstash big files Logstash	10	3969	December 15, 2016
Filebeat unable to cope with incoming logs Beats filebeat	7	1537	February 8, 2018

Logstash 6.4 use beats input plugin to collect large log file, but too slow

Related topics