OOM error in Filebeat

(Rohit Singh) #1

We are running Filebeat in a Docker container with a basic configuration and a single Logstash output.
Filebeat had been running continuously for the past few days, but yesterday it started to come up and go down repeatedly.

The logs reported an OOM error (the Filebeat version is 5.4.0).

The logs were as follows:

2017-09-25T22:29:06.901445+00:00 worker-1 kernel: filebeat invoked oom-killer: gfp_mask=0x24000c0(GFP_KERNEL), order=0, oom_score_adj=0
2017-09-25T22:29:06.901480+00:00 worker-1 kernel: filebeat cpuset=18ddf254d731651f1a17f07d595cdd5b6ebb7dd5d3d8262a5a32fe3920dc6e0c mems_allowed=0
2017-09-25T22:29:06.901504+00:00 worker-1 kernel: CPU: 7 PID: 92948 Comm: filebeat Not tainted 4.8.3-1.el7.elrepo.x86_64 #1
2017-09-25T22:29:06.921187+00:00 worker-1 kernel: Hardware name: Xen HVM domU, BIOS 4.2.amazon 02/16/2017
2017-09-25T22:29:06.921226+00:00 worker-1 kernel: 0000000000000286 00000000b6f2080d ffff8802859f7c30 ffffffff81353c9f
2017-09-25T22:29:06.921276+00:00 worker-1 kernel: ffff8802859f7d30 ffff8802f3ee8000 ffff8802859f7cc0 ffffffff81215b00
2017-09-25T22:29:06.921313+00:00 worker-1 kernel: 0000000000000046 ffff8802859f7ca0 ffffffff810ac4f8 ffff8802f3eead00
2017-09-25T22:29:06.921343+00:00 worker-1 kernel: Call Trace:
2017-09-25T22:29:06.921377+00:00 worker-1 kernel: [<ffffffff81353c9f>] dump_stack+0x63/0x84
2017-09-25T22:29:06.921424+00:00 worker-1 kernel: [<ffffffff81215b00>] dump_header+0x5d/0x1ed
2017-09-25T22:29:06.921460+00:00 worker-1 kernel: [<ffffffff810ac4f8>] ? try_to_wake_up+0x58/0x3c0
2017-09-25T22:29:06.921491+00:00 worker-1 kernel: [<ffffffff81192eeb>] ? find_lock_task_mm+0x3b/0x80
2017-09-25T22:29:06.921517+00:00 worker-1 kernel: [<ffffffff81193b05>] oom_kill_process+0x225/0x3f0
2017-09-25T22:29:06.921545+00:00 worker-1 kernel: [<ffffffff81209327>] ? mem_cgroup_iter+0x127/0x2c0
2017-09-25T22:29:06.921574+00:00 worker-1 kernel: [<ffffffff8120b605>] mem_cgroup_out_of_memory+0x2a5/0x2f0
2017-09-25T22:29:06.921603+00:00 worker-1 kernel: [<ffffffff8120c564>] mem_cgroup_oom_synchronize+0x314/0x340
2017-09-25T22:29:06.921632+00:00 worker-1 kernel: [<ffffffff81206fb0>] ? high_work_func+0x20/0x20
2017-09-25T22:29:06.921663+00:00 worker-1 kernel: [<ffffffff8119418c>] pagefault_out_of_memory+0x4c/0xc0
2017-09-25T22:29:06.921692+00:00 worker-1 kernel: [<ffffffff8107d36e>] mm_fault_error+0x6a/0x157
2017-09-25T22:29:06.921721+00:00 worker-1 kernel: [<ffffffff81068c90>] __do_page_fault+0x430/0x4a0
2017-09-25T22:29:06.921750+00:00 worker-1 kernel: [<ffffffff81068d30>] do_page_fault+0x30/0x80
2017-09-25T22:29:06.921779+00:00 worker-1 kernel: [<ffffffff8173bfc8>] page_fault+0x28/0x30
2017-09-25T22:29:06.923826+00:00 worker-1 kernel: Task in /docker/18ddf254d731651f1a17f07d595cdd5b6ebb7dd5d3d8262a5a32fe3920dc6e0c killed as a result of limit of /docker/18ddf254d731651f1a17f07d595cdd5b6ebb7dd5d3d8262a5a32fe3920dc6e0c
2017-09-25T22:29:06.923866+00:00 worker-1 kernel: memory: usage 262144kB, limit 262144kB, failcnt 17384
2017-09-25T22:29:06.923898+00:00 worker-1 kernel: memory+swap: usage 262144kB, limit 524288kB, failcnt 0
2017-09-25T22:29:06.924703+00:00 worker-1 kernel: kmem: usage 4520kB, limit 9007199254740988kB, failcnt 0
2017-09-25T22:29:06.929424+00:00 worker-1 kernel: Memory cgroup stats for /docker/18ddf254d731651f1a17f07d595cdd5b6ebb7dd5d3d8262a5a32fe3920dc6e0c: cache:260KB rss:257364KB rss_huge:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:257352KB inactive_file:260KB active_file:0KB unevictable:0KB
2017-09-25T22:29:06.929468+00:00 worker-1 kernel: [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
2017-09-25T22:29:06.930647+00:00 worker-1 kernel: [92469] 0 92469 1539 50 8 4 0 0 docker-entrypoi
2017-09-25T22:29:06.930689+00:00 worker-1 kernel: [92853] 0 92853 78637 63665 138 5 0 0 filebeat
2017-09-25T22:29:06.931654+00:00 worker-1 kernel: Memory cgroup out of memory: Kill process 92853 (filebeat) score 973 or sacrifice child

After we increased the memory, it became stable. We have graphs of I/O, CPU, and memory, and all of them show a sudden rise. I/O was around 4 GB; under normal conditions it is around 700 MB to 800 MB. Filebeat is given 256 MB of memory, which had been sufficient until now.
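
The figures in the kernel output can be cross-checked with a little arithmetic. The sketch below is only illustrative: it assumes 4 KiB pages (the usual x86_64 default, which the log does not state), the values are copied from the log lines above, and the function name `parse_cgroup_memory` is made up for this example.

```python
import re

# Line copied from the kernel log above.
LIMIT_LINE = "memory: usage 262144kB, limit 262144kB, failcnt 17384"
# The per-process table reports rss in pages; 63665 is filebeat's rss column.
FILEBEAT_RSS_PAGES = 63665
PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages, not stated in the log


def parse_cgroup_memory(line):
    """Return (usage_kib, limit_kib, failcnt) from a cgroup 'memory:' line."""
    m = re.search(r"usage (\d+)kB, limit (\d+)kB, failcnt (\d+)", line)
    if m is None:
        raise ValueError("not a cgroup memory line: %r" % line)
    return tuple(int(g) for g in m.groups())


usage_kib, limit_kib, failcnt = parse_cgroup_memory(LIMIT_LINE)
limit_mib = limit_kib / 1024
filebeat_rss_mib = FILEBEAT_RSS_PAGES * PAGE_SIZE_KIB / 1024

print("limit: %.0f MiB, usage at limit: %s" % (limit_mib, usage_kib == limit_kib))
print("filebeat rss: %.1f MiB" % filebeat_rss_mib)
```

With these inputs the limit works out to 256 MiB, and under the 4 KiB page assumption the Filebeat process alone accounts for roughly 248.7 MiB of resident memory. The kill was therefore triggered by the container's own cgroup limit, as the "Memory cgroup out of memory" line says, rather than by the host running out of memory.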

Questions:

  1. Did Filebeat receive more data than it could handle, causing it to OOM?
  2. Is this a Docker container issue, given that it worked seamlessly before?

(Andrew Kroh) #2

Please share your Filebeat config and logs.

(Rohit Singh) #3

Hi Andrew, the logs don't have much info because logging has been silenced. We are running Filebeat in Docker containers;
however, a "Killed" message was in the log.

The config files are as follows.

Main config file

filebeat.config_dir: /prospectors
filebeat.registry_file: <registry path>
filebeat.config.prospectors:
  enabled: true
  path: prospectors/*.yml
  reload.enabled: true
  reload.period: 10s

output.logstash:
  hosts: ["<logstashendpoint>"]

logging.level: error

Prospectors file

- input_type: log
  enabled: true
  scan_frequency: 10s
  paths:
    - /mnt/logs/
  exclude_files: ['\.gz$']
  document_type: syslog
  close_inactive: 1h
filebeat.registry_file: <registry path>

output.logstash:
  hosts: ["<logstashendpoint>"]

logging.level: error

(Andrew Kroh) #4

How many files are being monitored by Filebeat?
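
One rough way to answer this is to count the files under the configured path that survive the `exclude_files` pattern. The sketch below is an approximation, not Filebeat's own matching logic: it walks the directory recursively (an assumption, since Filebeat matches globs and the posted path may have been shortened), and `count_monitored_files` is a hypothetical helper name.

```python
import os
import re


def count_monitored_files(root, exclude_patterns=(r"\.gz$",)):
    """Count regular files under root whose path matches no exclude pattern."""
    excludes = [re.compile(p) for p in exclude_patterns]
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip anything matching an exclude pattern, e.g. rotated .gz files.
            if any(rx.search(path) for rx in excludes):
                continue
            count += 1
    return count
```

Calling `count_monitored_files("/mnt/logs/")` on the host that mounts the log directory gives an upper bound on how many files Filebeat could be harvesting.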

The info level logs could be useful. The metrics logged at 30 second intervals provide some details as to what's happening over time (how many events, how many readers are running, indicators of back-pressure from LS, etc.)
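
Once `logging.level` is set to `info`, the periodic metrics lines can be pulled apart with a few lines of Python. This is a minimal sketch under assumptions: the sample line is invented for illustration (it is not from the poster's logs), its timestamp prefix and metric names are only modelled on what Filebeat 5.x typically writes, and `parse_metrics_line` is a hypothetical helper.

```python
def parse_metrics_line(line):
    """Parse 'Non-zero metrics in the last 30s: k=v k=v' into a dict of ints.

    Returns None when the line is not a periodic metrics line.
    """
    marker = "Non-zero metrics in the last 30s:"
    _, sep, tail = line.partition(marker)
    if not sep:
        return None
    metrics = {}
    for pair in tail.split():
        key, eq, value = pair.partition("=")
        # Keep only well-formed integer counters.
        if eq and value.lstrip("-").isdigit():
            metrics[key] = int(value)
    return metrics


# Hypothetical sample line, NOT taken from the poster's logs.
SAMPLE = (
    "2017/09/25 22:28:36.000000 metrics.go:39: INFO "
    "Non-zero metrics in the last 30s: "
    "filebeat.harvester.running=12 "
    "libbeat.logstash.published_and_acked_events=2048 "
    "libbeat.publisher.published_events=2048"
)

for name, value in sorted(parse_metrics_line(SAMPLE).items()):
    print(name, value)
```

Tracking counters such as the number of running harvesters and the published versus acknowledged events across successive 30-second lines is one way to spot a burst of input or back-pressure from Logstash in the minutes before a kill.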

Is the "prospectors file" being modified at runtime causing a reload to occur?

Side note: only the filebeat.prospectors settings are honored from the prospector files, so you should remove filebeat.registry_file, output.logstash, and logging.level from them.
