Why is Filebeat SO slow at reading files (from SSD)? Only ~27 MB/s when console output goes to /dev/null

Hello,
For comparison, a simple single-threaded program can read at 3+ GB/s on the same machine.
However, with 7 cores allocated to Filebeat, the maximum throughput I got from Filebeat was only about 27 MB/s, and allocating more cores did not seem to improve it. Any insights into why Filebeat is so slow, or is Filebeat just very inefficient at reading files?
I am using Filebeat 7.2, and my config file is copied below. The command I used to run Filebeat is:
taskset -c 45,46,47,48,49,50,51 filebeat-7.2/filebeat -c filebeat-7.2/filebeat.yml -path.data ./data -path.logs ./logs > /dev/null &

    registry.flush: 1
    queue.mem:
      events: 524288
      flush.min_events: 262144
      flush.timeout: 1s

    filebeat.max_procs: 7
    #=========================== Filebeat prospectors =============================
    filebeat.inputs:

    - type: log

      paths:
        - logs/reader-logs/*
      close_removed: true

      clean_removed: true
      symlinks: true

      multiline.pattern: '^[[:digit:]]{4,}[[:punct:]][[:digit:]]{2,}|^[I|W|E|F|D|T][[:digit:]]{4,}'
      multiline.negate: true
      multiline.match: after

    output.console:
      pretty: false
      bulk_max_size: 262144
      worker: 7

My simple file-reading program is copied below; even "cat file > /dev/null" is faster than my program.

    #include <fstream>
    #include <string>

    int main() {
      std::string line;
      std::ifstream infile("mylog");
      // Check getline() directly instead of eof() so the loop ends cleanly at EOF.
      while (std::getline(infile, line)) {
        // Read and discard each line; this only measures read speed.
      }
      return 0;
    }

Btw, here is the perf output for my simple read-file program:

    $ perf stat -p 21186 sleep 30

     Performance counter stats for process id '21186':

          29929.798411      task-clock (msec)         #    0.998 CPUs utilized
                    68      context-switches          #    0.002 K/sec
                   287      cpu-migrations            #    0.010 K/sec
                     0      page-faults               #    0.000 K/sec
        89,788,668,066      cycles                    #    3.000 GHz
       291,205,032,556      instructions              #    3.24  insn per cycle
        65,831,832,834      branches                  # 2199.541 M/sec
             1,469,251      branch-misses             #    0.00% of all branches

          30.001622575 seconds time elapsed

And for Filebeat:

    $ perf stat -p 19409 sleep 30

     Performance counter stats for process id '19409':

          82839.873719      task-clock (msec)         #    2.760 CPUs utilized
               531,789      context-switches          #    0.006 M/sec
                20,841      cpu-migrations            #    0.252 K/sec                    (100.00%)
               240,580      page-faults               #    0.003 M/sec                    (100.00%)
       246,444,887,688      cycles                    #    2.975 GHz
       295,357,314,548      instructions              #    1.20  insn per cycle
        63,517,014,753      branches                  #  766.744 M/sec
           641,956,934      branch-misses             #    1.01% of all branches          (100.00%)

          30.013647236 seconds time elapsed

Why does Filebeat have so many context switches, CPU migrations, and page faults? I pinned both programs to CPU cores.

Btw, even when I set up Filebeat to read only one file, it still has many context switches and page faults. For this one-file test I pinned Filebeat to 2 CPU cores:

    $ perf stat -p 26946 sleep 30

     Performance counter stats for process id '26946':

          39337.198450      task-clock (msec)         #    1.311 CPUs utilized
               285,855      context-switches          #    0.007 M/sec                    (100.00%)
                 1,681      cpu-migrations            #    0.043 K/sec                    (100.00%)
               131,950      page-faults               #    0.003 M/sec                    (100.00%)
       116,937,084,408      cycles                    #    2.973 GHz
       194,636,200,866      instructions              #    1.66  insn per cycle           (100.00%)
        41,472,402,829      branches                  # 1054.280 M/sec                    (100.00%)
           314,597,891      branch-misses             #    0.76% of all branches          (100.00%)

          30.006805060 seconds time elapsed

Your program just reads a single file and does not seem to do anything with the data. Filebeat seems to monitor multiple files, do multiline processing and also write to console, which can often slow down performance. I therefore do not think it is a fair comparison. I would recommend enhancing your program to do the same amount of work and then compare.

It may still be slower, but I am not sure how important that is. Filebeat can only read as fast as downstream systems can accept the data, and that is often the bottleneck.
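For reference, here is a minimal sketch of what a more comparable baseline might look like: read a file line by line, group lines into multiline events with a start-of-event regex similar to the multiline.pattern in the config above, and write each event to stdout. This is only an illustration, not code from the original poster; the file name "mylog" and the simplified regex are assumptions.

    #include <fstream>
    #include <iostream>
    #include <regex>
    #include <string>

    int main() {
      std::ifstream infile("mylog");  // placeholder file name
      // Roughly the same idea as the multiline.pattern in the config above.
      std::regex start_of_event("^[0-9]{4,}[[:punct:]][0-9]{2,}|^[IWEFDT][0-9]{4,}");
      std::string line, event;
      while (std::getline(infile, line)) {
        if (std::regex_search(line, start_of_event)) {
          if (!event.empty()) std::cout << event << '\n';  // emit the previous event
          event = line;                                    // start a new event
        } else {
          event += '\n';
          event += line;                                   // continuation line
        }
      }
      if (!event.empty()) std::cout << event << '\n';      // emit the last event
      return 0;
    }

Piping its output to /dev/null would make it somewhat closer to the Filebeat console-output test, although it still would not do the JSON encoding, registry bookkeeping, or batching that Filebeat does.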

@Christian_Dahlqvist, thanks for the reply!
In the data I reported above, Filebeat monitors 10 files with 5 cores.
I also tested the scenario where Filebeat monitors only one file with 2 cores allocated to it (see the last case reported above); the throughput was actually a bit lower than the number reported above, which might be because more cores were available for work in the first case.
I also tried removing multiline processing, and the throughput improvement was only 5%. As I mentioned, I also tried "cat file > /dev/null", which is equivalent to reading a file and writing to the console, and which is even faster than my simple program (I did not try to optimize my simple program).
My point is that the slowness of Filebeat is NOT due to monitoring multiple files, doing multiline processing, or writing to the console.
In my benchmark experiments, the throughput when sending to Kafka is very close to the throughput when sending to the console, which shows that Kafka, the network, and other downstream systems are not the bottleneck in our production setup.

30 MB/s vs. 3+ GB/s is a 100x difference. I brought this up because you might want to investigate; there may be low-hanging fruit. I documented this mostly to contribute back to the community.
We appreciate that Filebeat gave us something ready to use. However, at the end of the day, if Filebeat cannot provide the throughput we need, we will have to bite the bullet and develop an in-house replacement. Thanks.
