Multiple deleted .tmp files left open by filebeat service


(Tony S.) #1

I held off on posting a question about this, thinking it was just my own mistake... now I'm not so certain (and it still may be).
I have a log file that is written out by Tomcat fairly regularly, with intermittent breaks.

Here's a modified version of what I have NOW:

    filebeat.prospectors:
    - type: log
      paths:
        - /opt/tomcat/logs/application.log
      exclude_lines: []
      multiline.pattern: ^\d{4}-\d{2}-\d{2}_
      multiline.negate: true
      multiline.match: after
      ignore_older: 30m
      tail_files: true
      exclude_files: [".zip$"]
      fields: {application: "embeddedportal", index: "v4emb"}
      enabled: True

I have deleted the file: /var/lib/filebeat/registry
The logfile is being ingested, but not long after running, I have a series of these...and filebeat keeps adding more.

filebeat   6343 30727    root   68r      REG              253,0  10493019     158586 /opt/tomcat/logs/application.log12025895980106140.tmp (deleted)
filebeat   6343 30727    root   69r      REG              253,0  10486122     158590 /opt/tomcat/logs/application.log12025954310868597.tmp (deleted)
filebeat   6343 30727    root   70r      REG              253,0  10486573     158595 /opt/tomcat/logs/application.log12026055144410584.tmp (deleted)
filebeat   6343 30727    root   71r      REG              253,0  10485841     158598 /opt/tomcat/logs/application.log12026100258340372.tmp (deleted)
filebeat   6343 30727    root   72r      REG              253,0  10487221     158602 /opt/tomcat/logs/application.log12026141517669076.tmp (deleted)

With that said: some have told me it's Tomcat causing this, but I don't believe that to be true. I am on the latest 6.4 Filebeat to date, and Tomcat doesn't own the files that are in the deleted state. I do have to say, though, these .tmp files add up quite quickly. I've tried adding .tmp to the exclude list, not that it should matter, as it doesn't match the entry in the paths section.

Why isn't filebeat letting go of the files after 5 minutes, or whatever interval I set? There are options that are supposed to be enabled by default (close_renamed, close_removed, etc.).


(Noémi Ványi) #2

You are right, it's not only Tomcat's fault. Filebeat needs to release the file descriptors of deleted files, but sometimes it fails to do so. You can, however, force Filebeat to release these FDs.
Have you tried setting the close_removed option? https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html#filebeat-input-log-close-removed
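
A minimal sketch of how that would look in the input config (assuming the same log input as above; per the docs, close_removed defaults to true, so setting it explicitly mainly documents intent):

    filebeat.prospectors:
    - type: log
      paths:
        - /opt/tomcat/logs/application.log
      # Close the harvester as soon as the file is removed on disk:
      close_removed: true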


(Tony S.) #3

close_removed didn't seem to do the trick. However, close_timeout set to 5m appears to have released the files; I no longer have thousands of file handles being left open, only 33 in the last 12h. What I still don't understand is why I only have this problem with one of our Tomcat applications and not the others.


(ruflin) #4

Can you share what log rotation mechanism you are using?

Could there be a reason that Filebeat falls behind on the Tomcat installation you mention? What FB version are you using?


(Tony S.) #6

Currently using logback with Apache Tomcat:
ch.qos.logback.core.rolling.RollingFileAppender
Rollover is at 10 MB, after which the file is compressed to a zip.

I am currently on the latest 6.4; I was on 6.3.2 and the same problem existed.
My Filebeat config currently excludes .zip files for the particular logfile that is affected. So close_removed should have caught this, as it's on by default according to the docs.


(Tony S.) #7

actual config in the webapps logback.xml

    <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
            <file>${catalina.base}/logs/webapp.log</file>
            <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
                    <Pattern>%d{yyyy-MM-dd_HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</Pattern>
            </encoder>

            <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
                    <!-- daily rollover or filesize > 10MB -->
                    <fileNamePattern>${catalina.base}/logs/webapp.%d{yyyy-MM-dd}.%i.log.zip</fileNamePattern>
                    <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedFNATP">
                            <!-- or whenever the file size reaches 10MB -->
                            <maxFileSize>10MB</maxFileSize>
                    </timeBasedFileNamingAndTriggeringPolicy>
                    <!-- keep 30 days' worth of history -->
                    <maxHistory>30</maxHistory>
            </rollingPolicy>
    </appender>

(Tony S.) #8

filebeat:

    - type: log
      paths:
        - /opt/tomcat/logs/webapp.log
      exclude_lines: []
      multiline.pattern: ^\d{4}-\d{2}-\d{2}_
      multiline.negate: true
      multiline.match: after
      ignore_older: 1h
      close_timeout: 5m
      exclude_files: [".zip$",".tmp$"]
      fields: {application: "webapp", index: "v4emb"}
      enabled: True

I tried multiple variations on .zip$ vs .zip; nothing really changed there. Also tried excluding the .tmp files, which didn't work either.
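
One thing worth noting (my own suggestion, not something tried in the thread): exclude_files takes Go regular expressions, where an unescaped dot matches any character, so escaping it makes the pattern stricter and the intent clearer:

    # ".zip$" would also match e.g. "xzip"; escaping the dot avoids that:
    exclude_files: ['\.zip$', '\.tmp$']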


(ruflin) #9

If I read the rollover policy correctly, it goes directly from the .log file to the .zip file without any intermediary .log.1 or similar files. I remember seeing another case with a similar rollover pattern which had the same issue, but only on one machine.

Is the machine where you have the issue under more load, so that log rotation happens more quickly? Could you share some stats from Filebeat? (You see them every 30s in the logs.)


(Tony S.) #10

I suppose load could be an issue, but it seems to be a problem no matter what time of day. I'll check the logs quickly. Also of note: I can stop filebeat, and as soon as I start it, about 10 seconds in, there are immediately about 10-30 files already deleted and abandoned, and it increments from there about every 5 to 8 minutes.
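
A quick way to watch those counts from the shell (a sketch using /proc on Linux; substituting the real Filebeat PID, e.g. via pidof, is an assumption):

```shell
# Count files a process holds open that are already deleted on disk.
# Linux-only: reads /proc/<pid>/fd; equivalent to grepping lsof for "(deleted)".
count_deleted_fds() {
  ls -l "/proc/$1/fd" 2>/dev/null | grep -c '(deleted)'
}

# Against Filebeat it would be (assuming pidof is available):
#   count_deleted_fds "$(pidof filebeat)"
count_deleted_fds $$   # demo against the current shell; normally 0
```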


(Tony S.) #11

The log doesn't show these files open but they are:

filebeat 31128 root 6r REG 253,0 10486158 156125 /opt/tomcat/logs/webapp.log14426486478728623.tmp (deleted)
filebeat 31128 root 9r REG 253,0 10490290 156121 /opt/tomcat/logs/webapp.log14426525027080023.tmp (deleted)
filebeat 31128 root 10r REG 253,0 10487135 156124 /opt/tomcat/logs/webapp.log14426601128410056.tmp (deleted)
filebeat 31128 31129 root 6r REG 253,0 10486158 156125 /opt/tomcat/logs/webapp.log14426486478728623.tmp (deleted)
filebeat 31128 31129 root 9r REG 253,0 10490290 156121 /opt/tomcat/logs/webapp.log14426525027080023.tmp (deleted)
filebeat 31128 31129 root 10r REG 253,0 10487135 156124 /opt/tomcat/logs/webapp.log14426601128410056.tmp (deleted)
filebeat 31128 31130 root 6r REG 253,0 10486158 156125 /opt/tomcat/logs/webapp.log14426486478728623.tmp (deleted)
filebeat 31128 31130 root 9r REG 253,0 10490290 156121 /opt/tomcat/logs/webapp.log14426525027080023.tmp (deleted)
filebeat 31128 31130 root 10r REG 253,0 10487135 156124 /opt/tomcat/logs/webapp.log14426601128410056.tmp (deleted)
filebeat 31128 31131 root 6r REG 253,0 10486158 156125 /opt/tomcat/logs/webapp.log14426486478728623.tmp (deleted)
...

Had to drop some lines but...

2018-10-10T09:24:04.329-0500 INFO log/harvester.go:251 Harvester started for file: /opt/tomcat/logs/webapp.log
2018-10-10T09:24:18.881-0500 INFO [monitoring] log/log.go:141 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":520,"time":{"ms":88}},"total":{"ticks":3460,"time":{"ms":804},"value":3460},"user":{"ticks":2940,"time":{"ms":716}}},"info":{"ephemeral_id":"feaac80e-9d51-43e1-af4e-b6406eea85e2","uptime":{"ms":120014}},"memstats":{"gc_next":17369584,"memory_alloc":15658536,"memory_total":383725488,"rss":450560}},"filebeat":{"events":{"active":2,"added":12981,"done":12979},"harvester":{"open_files":4,"running":4,"started":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":12967,"active":1369,"batches":12,"total":14336},"read":{"bytes":72},"write":{"bytes":874134}},"pipeline":{"clients":5,"events":{"active":4118,"filtered":12,"published":12967,"total":12980},"queue":{"acked":12967}}},"registrar":{"states":{"current":8,"update":12979},"writes":{"success":13,"total":13}},"system":{"load":{"1":0.05,"15":0.13,"5":0.13,"norm":{"1":0.025,"15":0.065,"5":0.065}}}}}}
2018-10-10T09:24:35.061-0500 INFO log/harvester.go:251 Harvester started for file: /opt/tomcat/logs/webapp.log
2018-10-10T09:24:48.880-0500 INFO [monitoring] log/log.go:141 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":600,"time":{"ms":86}},"total":{"ticks":4190,"time":{"ms":730},"value":4190},"user":{"ticks":3590,"time":{"ms":644}}},"info":{"ephemeral_id":"feaac80e-9d51-43e1-af4e-b6406eea85e2","uptime":{"ms":150014}},"memstats":{"gc_next":18828848,"memory_alloc":9659304,"memory_total":452998496,"rss":253952}},"filebeat":{"events":{"active":3,"added":12302,"done":12299},"harvester":{"open_files":5,"running":5,"started":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":12288,"batches":6,"total":12288},"read":{"bytes":54},"write":{"bytes":934078}},"pipeline":{"clients":5,"events":{"active":4117,"filtered":14,"published":12288,"total":12301},"queue":{"acked":12288}}},"registrar":{"states":{"current":9,"update":12299},"writes":{"success":7,"total":7}},"system":{"load":{"1":0.03,"15":0.13,"5":0.12,"norm":{"1":0.015,"15":0.065,"5":0.06}}}}}}
2018-10-10T09:25:18.881-0500 INFO [monitoring] log/log.go:141 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":710,"time":{"ms":111}},"total":{"ticks":4930,"time":{"ms":745},"value":4930},"user":{"ticks":4220,"time":{"ms":634}}},"info":{"ephemeral_id":"feaac80e-9d51-43e1-af4e-b6406eea85e2","uptime":{"ms":180016}},"memstats":{"gc_next":16910592,"memory_alloc":11708888,"memory_total":534101832}},"filebeat":{"events":{"active":-8,"added":14343,"done":14351},"harvester":{"open_files":5,"running":5}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":14336,"batches":7,"total":14336},"read":{"bytes":60},"write":{"bytes":1074584}},"pipeline":{"clients":5,"events":{"active":4117,"filtered":8,"published":14336,"total":14344},"queue":{"acked":14336}}},"registrar":{"states":{"current":9,"update":14351},"writes":{"success":8,"total":8}},"system":{"load":{"1":0.02,"15":0.13,"5":0.1,"norm":{"1":0.01,"15":0.065,"5":0.05}}}}}}


(system) #12

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.