Performance: Filebeat 5.0 vs 5.4

Hi,

I have Filebeat 5.0 installed and running.
I am testing the upgrade of Filebeat to release 5.4.
The service runs on Windows Server, and Filebeat accesses the log files through Windows shares.

While comparing the performance of release 5.4 with 5.0, I see about a 5% performance degradation between the two.

I ran two processes, one of version 5.0 and the other of version 5.4.
On the Elasticsearch side, sending data and indexing were pretty much the same from both instances.
Only the CPU differed: 5.4 used more CPU than 5.0.

What can cause this performance degradation between versions?
How can I be sure that my tests are correct?

Thanks,

Ori

Has anyone run into a performance issue when upgrading from Filebeat 5.0 to 5.4?

Ori

Between 5.0 and 5.4 we mainly added features and fixed bugs. In general this means there are a few more if/else clauses and validation checks, which could mean a slight performance impact. This is only a theory on my side to potentially explain a slight difference. It will also depend heavily on the config you use, for example multiline. In addition, 5.4 ships with a different Golang version, which could also have an effect on performance. So far we have not had any feedback about performance degradation. Doing properly "comparable" benchmarks one day is on our todo list, but we don't have them yet.

It would be interesting to hear what your benchmarks show with 6.0, as there we made some fundamental changes to how events are forwarded in Filebeat, and we are planning further changes that could potentially mean a performance improvement.

In general it is worth mentioning that Filebeat's resource usage is often in the single-digit range. So if Filebeat uses 3% of the CPU, a 5% difference is hard to see. Don't get me wrong here: a 5% degradation is definitely not something we want, but the point is that it could be there and in most cases still not be visible.

Hi Ruflin,

Thanks!!!!
We are running a dedicated Filebeat server which runs about 22 different Filebeat services.
Each service reads a certain log type from 5 different application servers through a network share.

The average CPU usage is between 60% and 80%.

So what I am afraid of is upgrading all of those at once to the newer release and then, under heavy load, the Filebeat server starving for CPU.

Ori

Hi Ruflin,

Each Filebeat service is in charge of a different log type.
There are 22 different Filebeat services on a dedicated Filebeat machine (Windows OS).
The prospectors access the application servers, which are also Windows machines, through a network share to each application server.

Each Filebeat YML has as many prospectors as there are application servers.
In our QA env there are 4; in our Prod there are 5.

In our QA env, I compared Filebeat 5.0 to 5.4 against the two most resource-consuming log types.
I got a 5% degradation for Filebeat 5.4 compared to Filebeat 5.0.

For example, 35% average CPU usage over time for 5.4, against 30% average CPU usage over the same period for 5.0.
Both accessed the same files at the same time.
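
(For reference, one repeatable way to capture such averages on Windows is the built-in typeperf tool. This is just a sketch; the counter instance names are assumptions, since with two filebeat.exe processes running side by side they typically appear as filebeat and filebeat#1, so adjust them to your setup:

  typeperf "\Process(filebeat)\% Processor Time" "\Process(filebeat#1)\% Processor Time" -si 5 -sc 720 -o filebeat_cpu_compare.csv

This samples both processes every 5 seconds for an hour and writes a CSV whose columns can be averaged afterwards. Note that % Processor Time is per core, so it can exceed 100 on multi-core machines.)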

Therefore I need to know how safe it will be to move all 22 Filebeat services from version 5.0 to 5.4.
This was our QA env; in Prod there will be more data generated, and the Filebeat services may consume a lot more CPU.

Need help to understand!!!

Ori

Are you still reading files from a shared volume? Could you share your configs?

Hi, yes, still reading from shared volumes.

Config File:
filebeat:
  registry_file: c:/Program Files/Filebeat/registry/registry_log_type_1
  prospectors:
    # log_type_1
    -
      paths:
        - \\myapp01\Logs/log_type_1/file*.log*
      input_type: log
      multiline:
        pattern: ^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[[:space:]][[:digit:]]{1,2},[[:space:]][[:digit:]]{4}[[:space:]][[:digit:]]{1,2}:[[:digit:]]{2}:[[:digit:]]{2}[[:space:]](AM|PM)
        negate: true
        match: after
      exclude_lines: [ ^\* ]
      ignore_older: 2h
      close_inactive: 20m
      document_type: log_type_1
      harvester_buffer_size: 16384
      encoding: plain
      fields_under_root: true
      fields:
        app_env: QA
        type_id: 110
        time_to_keep: 43200
        hostname: myapp01

    # log_type_1
    -
      paths:
        - \\myapp02\Logs/log_type_1/file*.log*
      input_type: log
      multiline:
        pattern: ^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[[:space:]][[:digit:]]{1,2},[[:space:]][[:digit:]]{4}[[:space:]][[:digit:]]{1,2}:[[:digit:]]{2}:[[:digit:]]{2}[[:space:]](AM|PM)
        negate: true
        match: after
      exclude_lines: [ ^\* ]
      ignore_older: 2h
      close_inactive: 20m
      document_type: log_type_1
      harvester_buffer_size: 16384
      encoding: plain
      fields_under_root: true
      fields:
        app_env: QA
        type_id: 110
        time_to_keep: 43200
        hostname: myapp02

    # log_type_1
    -
      paths:
        - \\myapp03\Logs/log_type_1/file*.log*
      input_type: log
      multiline:
        pattern: ^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[[:space:]][[:digit:]]{1,2},[[:space:]][[:digit:]]{4}[[:space:]][[:digit:]]{1,2}:[[:digit:]]{2}:[[:digit:]]{2}[[:space:]](AM|PM)
        negate: true
        match: after
      exclude_lines: [ ^\* ]
      ignore_older: 2h
      close_inactive: 20m
      document_type: log_type_1
      harvester_buffer_size: 16384
      encoding: plain
      fields_under_root: true
      fields:
        app_env: QA
        type_id: 110
        time_to_keep: 43200
        hostname: myapp03

    # log_type_1
    -
      paths:
        - \\myapp04\Logs/log_type_1/file*.log*
      input_type: log
      multiline:
        pattern: ^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[[:space:]][[:digit:]]{1,2},[[:space:]][[:digit:]]{4}[[:space:]][[:digit:]]{1,2}:[[:digit:]]{2}:[[:digit:]]{2}[[:space:]](AM|PM)
        negate: true
        match: after
      exclude_lines: [ ^\* ]
      ignore_older: 2h
      close_inactive: 20m
      document_type: log_type_1
      harvester_buffer_size: 16384
      encoding: plain
      fields_under_root: true
      fields:
        app_env: QA
        type_id: 110
        time_to_keep: 43200
        hostname: myapp04

logging:
  level: debug
  to_files: true
  to_syslog: false
  files:
    path: c:/Program Files/Filebeat/logs
    name: filebeat_log_type_1.log
    rotateeverybytes: 20971520
    keepfiles: 100

output:
  elasticsearch:
    hosts: ["elastic01:9200", "elastic02:9200"]
    index: "my-logs"
    bulk_max_size: 10000    # default is 50
    flush_interval: 60      # default unknown
    parameters.pipeline: "my-pipeline"

I assume you are aware that we still do not recommend / support shared volumes :wink: At the same time, I'm happy to hear that it seems to work.

I couldn't spot anything special in the config files. If you are worried about increased resource usage from Filebeat, I would recommend sticking with 5.0 unless there is a feature or bug fix that you critically need. As there are some changes in 6.0 that could make things more efficient, perhaps you can upgrade directly to 6.0 as soon as it is out.

So far this is the only report of a performance degradation, so it would be interesting to hear from others who have upgraded. It would also be interesting to know whether this is related to shared volumes or whether it also happens with local volumes.

Hi Ruflin,

Can you suggest a testing method with which I can better compare the performance of those two releases?

Thanks,

Ori

I think the best way would be to use the exact same machines with the exact same setup, reading from local disks instead of shared drives, and writing the output to files instead of Elasticsearch, as sketched below.
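
As a minimal sketch of such a benchmark config (the path below is just a placeholder), the Elasticsearch output can be swapped for the file output, which takes network and cluster variance out of the measurement:

output:
  file:
    path: "c:/filebeat-bench/output"
    filename: filebeat_bench
    rotate_every_kb: 102400
    number_of_files: 7

Running 5.0 and 5.4 one after the other against the same local copy of the logs, each with this output, should give a cleaner CPU comparison.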

As mentioned before, there are potential reasons for a minor performance loss in Filebeat 5.4 compared to 5.0, as more features were added. There is also a publisher refactoring happening for 6.0. If there is a small performance loss in 5.4 and it affects your use case, I would recommend staying on 5.0 if you can and waiting for 6.0 to do more comparisons.

If you want to debug in detail what uses how many resources, you could have a look at enabling -httpprof and digging in there.
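
For reference, -httpprof is a command-line flag in 5.x; a typical session could look like this (host and port are just examples):

  filebeat.exe -e -c filebeat.yml -httpprof 127.0.0.1:6060

Once Filebeat is running you can open http://127.0.0.1:6060/debug/pprof/ in a browser to see the available profiles, or pull a 30-second CPU profile with the standard Go tooling:

  go tool pprof filebeat.exe http://127.0.0.1:6060/debug/pprof/profile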

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.