Filebeat memory leak: usage increases suddenly after all the files are consumed


(Guanghaofan) #1

I use Filebeat to ship data to Elasticsearch; there are over 100 files to consume. My harvester settings are below: the harvester is closed immediately at EOF, and harvester_limit is 2. It works fine, but the memory usage increases suddenly after all the files are consumed, and this can always be reproduced.

close_eof: true
harvester_limit: 2

filebeat log:

    2018-05-15T09:43:04.833+0800    INFO    [monitoring]    log/log.go:124  Non-zero metrics in the last 30s        {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":1576890,"time":1576896},"total":{"ticks":12993030,"time":12993044,"value":12993030},"user":{"ticks":11416140,"time":11416148}},"info":{"ephemeral_id":"e456b388-15b3-4fa5-b56b-a43ecdbc2a92","uptime":{"ms":81780013}},"memstats":{"gc_next":226172960,"memory_alloc":143725800,"memory_total":1517236802832}},"filebeat":{"events":{"added":152,"done":152},"harvester":{"closed":76,"open_files":0,"running":0,"skipped":452,"started":76}},"libbeat":{"config":{"module":{"running":0}},"pipeline":{"clients":1,"events":{"active":0,"filtered":152,"total":152}}},"registrar":{"states":{"current":468,"update":152},"writes":152},"system":{"load":{"1":1.1,"15":0.87,"5":0.91,"norm":{"1":0.1375,"15":0.1088,"5":0.1138}}},"xpack":{"monitoring":{"pipeline":{"events":{"published":3,"total":3},"queue":{"acked":3}}}}}}}
    2018-05-15T09:43:10.509+0800    INFO    log/harvester.go:216    Harvester started for file: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC618_M77ZF.00_WS1_BBFAG051MVC3_11_0_180228_105444.kdf.test_item.log
    2018-05-15T09:43:10.509+0800    INFO    log/harvester.go:239    End of file reached: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC618_M77ZF.00_WS1_BBFAG051MVC3_11_0_180228_105444.kdf.test_item.log. Closing because close_eof is enabled.
    2018-05-15T09:43:10.509+0800    INFO    log/harvester.go:216    Harvester started for file: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC605_M77W4.00_WS1_ICWAD004TMA7_17_0_180228_011341.kdf.test_item.log
    2018-05-15T09:43:10.509+0800    INFO    log/harvester.go:239    End of file reached: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC605_M77W4.00_WS1_ICWAD004TMA7_17_0_180228_011341.kdf.test_item.log. Closing because close_eof is enabled.
    2018-05-15T09:43:10.513+0800    INFO    log/harvester.go:216    Harvester started for file: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC605_M77W4.00_WS1_ICWAD012TME2_24_0_180228_123639.kdf.test_item.log
....
    2018-05-15T09:43:10.723+0800    INFO    log/harvester.go:239    End of file reached: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC600_M780Q.00_WS1_KBLME098WFA7_15_0_180228_055531.kdf.test_item.log. Closing because close_eof is enabled.
    2018-05-15T09:43:10.729+0800    INFO    log/harvester.go:216    Harvester started for file: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC600_M780Q.00_WS1_KBLME146WFF7_5_0_180227_143935.kdf.test_item.log
    2018-05-15T09:43:10.729+0800    INFO    log/harvester.go:239    End of file reached: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC600_M780Q.00_WS1_KBLME146WFF7_5_0_180227_143935.kdf.test_item.log. Closing because close_eof is enabled.
    2018-05-15T09:43:10.735+0800    INFO    log/harvester.go:216    Harvester started for file: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC606_M77RS.00_WS1_POOFZ086MXC7_21_0_180227_221740.kdf.test_item.log
    2018-05-15T09:43:10.741+0800    INFO    log/harvester.go:239    End of file reached: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC606_M77RS.00_WS1_POOFZ087MXH1_20_0_180227_204544.kdf.test_item.log. Closing because close_eof is enabled.
    2018-05-15T09:43:10.744+0800    ERROR   log/prospector.go:460   Harvester could not be started on existing file: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC620_M77U3.00_WS1_KBKMR088WFG6_22_0_180228_020439.kdf.test_item.log, Err: Harvester limit reached
    2018-05-15T09:43:10.744+0800    ERROR   log/prospector.go:460   Harvester could not be started on existing file: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC623_M77XT.00_WS1_KBLQW083WFB2_24_0_180228_093910.kdf.test_item.log, Err: Harvester limit reached
    2018-05-15T09:43:10.744+0800    ERROR   log/prospector.go:460   Harvester could not be started on existing file: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC601_M77WJ.00_WS1_KBLEH010WFD7_25_0_180228_081551.kdf.test_item.log, Err: Harvester limit reached
    2018-05-15T09:43:10.744+0800    ERROR   log/prospector.go:460   Harvester could not be started on existing file: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC601_M77WJ.00_WS1_KBLEH127WFA3_13_0_180227_142332.kdf.test_item.log, Err: Harvester limit reached
    2018-05-15T09:43:10.744+0800    ERROR   log/prospector.go:460   Harvester could not be started on existing file: /media/ghfan/B6B4B252B4B21539/kdf/20180228/GFDRS_2W5-087608-00_P_ATC606_M77ZC.00_WS1_BBFDZ020MVA6_1_0_180228_065435.kdf.test_item.log, Err: Harvester limit reached

....
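
To make the jump easier to see, the memory numbers can be pulled out of the "Non-zero metrics" lines above. A rough standalone Python sketch (not part of Filebeat; the log path passed on the command line is just whatever file contains these lines) that prints memstats.memory_alloc per sample for plotting:

    # Rough sketch, not part of Filebeat: extract memstats.memory_alloc from the
    # periodic "Non-zero metrics" log lines so memory usage can be charted over time.
    import json
    import re
    import sys

    # timestamp, then the JSON payload logged after "Non-zero metrics in the last 30s"
    METRICS_RE = re.compile(r'(\S+)\s+INFO\s+\[monitoring\].*?(\{"monitoring":.*\})\s*$')

    def dump_memory(path):
        with open(path, encoding="utf-8") as log:
            for line in log:
                m = METRICS_RE.search(line)
                if not m:
                    continue
                ts, payload = m.group(1), m.group(2)
                try:
                    alloc = json.loads(payload)["monitoring"]["metrics"]["beat"]["memstats"]["memory_alloc"]
                except (ValueError, KeyError):
                    continue
                # timestamp, allocated heap in MiB -- CSV-style output for plotting
                print("%s,%.1f" % (ts, alloc / (1024.0 * 1024.0)))

    if __name__ == "__main__":
        # argument: path to the Filebeat log file (example usage, adjust to your setup)
        dump_memory(sys.argv[1])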

  1. Beat version: 6.2.4
  2. Operating System: CentOS
  3. Configuration:
> - type: log
> 
>   # Change to true to enable this prospector configuration.
>   enabled: true
> 
>   # Paths that should be crawled and fetched. Glob based paths.
>   paths:
>     - /media/ghfan/B6B4B252B4B21539/kdf/20180228/*item.log
>     #- /home/ghfan/kdf/*item.log
>     #- c:\programdata\elasticsearch\logs\*
>   scan_frequency: 10
>   # Exclude lines. A list of regular expressions to match. It drops the lines that are
>   # matching any regular expression from the list.
>   #exclude_lines: ['^DBG']
> 
>   # Include lines. A list of regular expressions to match. It exports the lines that are
>   # matching any regular expression from the list.
>   # include_lines: ['^ERR', '^WARN']
>   include_lines: ['^EventType']
> 
>   # Exclude files. A list of regular expressions to match. Filebeat drops the files that
>   # are matching any regular expression from the list. By default, no files are dropped.
>   #exclude_files: ['.gz$']
> 
>   # Optional additional fields. These fields can be freely picked
>   # to add additional information to the crawled log files for filtering
>   #fields:
>   #  level: debug
>   #  review: 1
> 
>   # Closes the file handler as soon as the harvesters reaches the end of the file.
>   # By default this option is disabled.
>   # Note: Potential data loss. Make sure to read and understand the docs for this option.
>   close_eof: true
> 
>   # Max number of harvesters that are started in parallel.
>   # Default is 0 which means unlimited
>   harvester_limit: 2

(Guanghaofan) #2

Zoom in for the details:

I then removed all the files; the memory usage only goes down a little and stays at the same level.


(ruflin) #3

We recently found a memory leak in Filebeat that occurred when a file was not found. I wonder if this could also apply here? Any chance you could try a snapshot build to see if you can reproduce it there as well? https://beats-package-snapshots.s3.amazonaws.com/index.html?prefix=filebeat/

What does the memory usage look like if you don't use the harvester_limit config?


(Guanghaofan) #4

With harvester_limit disabled, it consumed much more memory, since there are over 100 files. My case is not a file-not-found error: every opened harvester is closed successfully, only the parallel limit is 2. It runs very stably until all the files are consumed; still, I will try the new snapshot to check if it helps.


(ruflin) #5

Thanks, let us know how it goes.


(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.