Unstable Events Rate/Throughput

(Artem Kovalchuk) #1

Hello!

I was watching the Filebeat metrics and noticed that the event rate and throughput look like a jigsaw pattern.

It's not because there is nothing to send: our log files grow faster than Filebeat sends events to Elasticsearch. I made a graph to visualize it.

The orange graph is the sum of the sizes of all logs that should be processed by Logstash.
The green one is the sum of the offsets in the Filebeat registry file.
The delta between these two graphs grows slowly during the day, and at night (when our service is not under load) the graphs converge to the same value. Of course I want the delta to be as small as possible, but I think the unstable throughput plays the main role in my problem.
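The two graphs described above can be sketched as a small script. This is a minimal illustration, not part of the original setup: the registry path is hypothetical, and it assumes the Filebeat 6.x registry file is a JSON array of entries that each carry an `offset` field (verify against your own registry before relying on it).

```python
import glob
import json
import os

# Hypothetical paths -- adjust to your deployment.
LOG_GLOB = "/data/filebeat/logs/*/beat.*"
REGISTRY = "/var/lib/filebeat/registry"


def total_log_bytes(pattern):
    """Sum of on-disk sizes of all matching log files (the orange graph)."""
    return sum(os.path.getsize(p) for p in glob.glob(pattern))


def total_registry_offset(registry_path):
    """Sum of read offsets recorded in the registry (the green graph).

    Assumes the 6.x registry format: a JSON array of per-file entries,
    each with an "offset" field.
    """
    with open(registry_path) as f:
        entries = json.load(f)
    return sum(e.get("offset", 0) for e in entries)


def backlog_bytes(pattern=LOG_GLOB, registry_path=REGISTRY):
    """Delta between bytes written and bytes shipped: the unread backlog."""
    return total_log_bytes(pattern) - total_registry_offset(registry_path)
```

Sampling `backlog_bytes()` periodically gives the same delta the graphs show: it should trend toward zero whenever Filebeat keeps up.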

The fail/error monitoring graphs look good.

I use Elasticsearch and Filebeat 6.5.4.
My filebeat.yml configuration:

filebeat.inputs:
- type: log
  enabled: true
  paths:
  - /data/filebeat/logs/*/beat.*
  json.keys_under_root: true
  json.overwrite_keys: true
  json.add_error_key: true
  close_inactive: 1h

processors:
- drop_fields:
    fields: ["beat", "input", "offset", "prospector", "source", "host"]
- drop_event:
    when:
      not:
        has_fields: ["indexKey"]

setup.template.enabled: false
xpack.monitoring.enabled: true

queue.mem:
  events: 8192
  flush.min_events: 2048
  flush.timeout: 1s

output.elasticsearch:
  hosts: ["url_to_elasticsearch"]
  username: "XXXXXX"
  password: "XXXXXX"
  index: "%{[indexKey]}-%{+yyyy.MM.dd}"
  bulk_max_size: 2048
  worker: 4

logging.level: info
logging.to_files: true

Thank you in advance, any help will be very appreciated!

(Christian Dahlqvist) #2

Filebeat can only send as fast as downstream systems, e.g. Elasticsearch, are able to accept data, so it is worth looking at how Elasticsearch is performing and verifying whether or not it is the bottleneck. Do you e.g. see high CPU usage on Elasticsearch? Do you see evidence of long and/or frequent GC? If you look at stats, are merges being throttled? What does disk I/O and iowait look like?
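The checklist above can be made concrete by pulling a few indicators out of an Elasticsearch `GET _nodes/stats` response. The sketch below assumes the node-stats field names shown (merge throttling, old-generation GC, CPU percent); verify them against your cluster's actual output, since they can vary across versions.

```python
def bottleneck_signals(node_stats):
    """Extract per-node bottleneck indicators from a parsed
    GET _nodes/stats response (a dict).

    Assumed field layout:
      indices.merges.total_throttled_time_in_millis  -- merge throttling
      jvm.gc.collectors.old.*                        -- old-gen GC pressure
      os.cpu.percent                                 -- CPU usage
    """
    signals = {}
    for node_id, node in node_stats.get("nodes", {}).items():
        merges = node.get("indices", {}).get("merges", {})
        gc_old = (node.get("jvm", {}).get("gc", {})
                      .get("collectors", {}).get("old", {}))
        signals[node.get("name", node_id)] = {
            "merge_throttled_ms": merges.get("total_throttled_time_in_millis", 0),
            "old_gc_count": gc_old.get("collection_count", 0),
            "old_gc_time_ms": gc_old.get("collection_time_in_millis", 0),
            "cpu_percent": node.get("os", {}).get("cpu", {}).get("percent"),
        }
    return signals
```

Rising `merge_throttled_ms` or `old_gc_time_ms` between two samples, taken together with high CPU or iowait on the Elasticsearch hosts, points at Elasticsearch rather than Filebeat as the bottleneck.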

(Artem Kovalchuk) #3

Thank you for the advice! You're right, Elasticsearch was the bottleneck.
I monitored our metrics and noticed that some of our old services were sending events directly to Elasticsearch using a large number of threads (without Filebeat).
I've migrated all our services from direct event sending to Filebeat, and the problem is gone.
The Filebeat graphs look much better now.

(system) closed #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.