File was truncated. Begin reading file from offset 0 multiple times

Hi,
I'm new to the ELK stack and currently exploring its features for log aggregation.

We are using Tomcat 9 and ELK stack 8.5.1, and we are facing an issue where logs are duplicated in Kibana more than 3 to 5 times, depending on how often the log file is crawled. This is our Log4j config:

<Appender filePattern="@project.home@/logs/mimou.%d{yyyy-MM-dd}.json.log" ignoreExceptions="false" name="JSON_FILE" type="RollingFile">
    <JSONLayout compact="true" eventEol="true" properties="true" stacktraceAsString="true" includeTimeMillis="true">
        <KeyValuePair key="timestamp" value="$${date:yyyy-MM-dd'T'HH:mm:ss.SSSZ}" />
    </JSONLayout>
    <TimeBasedTriggeringPolicy />
    <DirectWriteRolloverStrategy />
</Appender>

and our Filebeat config:

filebeat.inputs:
- type: log
  paths:
  - '/opt/liferay/logs/mimou*.json.log'
  json.keys_under_root: false
  json.add_error_key: true
  json.overwrite_keys: true
  json.message_key: messages
#  close_inactive: 10m
#  clean_inactive: 25h
#  ignore_older: 24h
  fields:
      environment: ${ENVIRONMENT}
      stage: ${STAGE}
      cluster: ${CLUSTER}
  name: filebeat
  tags: ["${ENVIRONMENT}"]
  multiline.pattern: '^[[:space:]]+|^Caused by:'
  multiline.negate: false
  multiline.match: after
logging.to_stderr: true
output.logstash:
  enabled: true
  hosts: ["XX.XXXXXXXXX"]
  ssl.enabled: false
  ssl.certificate_authorities: ["./certs/XXXXXXXXX"]
  ssl.certificate: "./certs/XXXXXXXXX.crt"
  ssl.key: "./certs/secrets/XXXXXXXXX.key"

and the Filebeat logs show this multiple times:

{"log.level":"info","@timestamp":"2024-12-04T09:02:43.982Z","log.logger":"input.harvester","log.origin":{"file.name":"log/harvester.go","file.line":329},"message":"File was truncated. Begin reading file from offset 0.","service.name":"filebeat","input_id":"4e1d9019-83bb-477d-a4e1-1aa6424721d0","source_file":"/opt/system/logs/moumou.2024-12-04.json.log","state_id":"native::14411563612684419072-1048735","finished":false,"os_id":"14411563612684419072-1048735","harvester_id":"23b2693c-947b-4788-a1b7-17804cf56c19","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2024-12-04T09:02:50.989Z","log.logger":"input.harvester","log.origin":{"file.name":"log/harvester.go","file.line":310},"message":"Harvester started for paths: [/opt/system/logs/moumou*.json.log]","service.name":"filebeat","input_id":"4e1d9019-83bb-477d-a4e1-1aa6424721d0","source_file":"/opt/system/logs/moumou.2024-12-04.json.log","state_id":"native::14411563612684419072-1048735","finished":false,"os_id":"14411563612684419072-1048735","old_source":"/opt/system/logs/moumou.2024-12-04.json.log","old_finished":true,"old_os_id":"14411563612684419072-1048735","harvester_id":"8477a7cb-8230-4298-bf91-fe2edddd5134","ecs.version":"1.6.0"}

The log file is treated as a new file each time a new line is written, and it is crawled again from offset 0.

Has anyone encountered this before?

Thanks in advance

Hi @Mimouz,

This is a common problem with some log rotation approaches, as detailed in Log rotation results in lost or duplicate events | Filebeat Reference [8.16] | Elastic. Maybe there is a way to configure whatever tool performs the rotation on your system to do so without file truncation?

Also, you are using the deprecated log input; please switch your input to the filestream input.
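
For reference, here is a rough sketch of what the equivalent filestream input could look like, assuming the same paths, JSON settings, and multiline pattern as your current log input (the id value is just a placeholder; please verify the option names against the filestream documentation for your Filebeat version):

filebeat.inputs:
- type: filestream
  # filestream inputs need a unique id so their state is tracked correctly
  id: tomcat-json-logs
  paths:
  - '/opt/liferay/logs/mimou*.json.log'
  parsers:
  - ndjson:
      # roughly equivalent to json.keys_under_root: false (decoded keys stay under "json")
      target: "json"
      add_error_key: true
      overwrite_keys: true
      message_key: messages
  - multiline:
      type: pattern
      pattern: '^[[:space:]]+|^Caused by:'
      negate: false
      match: after
  fields:
    environment: ${ENVIRONMENT}
    stage: ${STAGE}
    cluster: ${CLUSTER}
  tags: ["${ENVIRONMENT}"]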

Unfortunately, the problem is not resolved using filestream:

{......"message":"File was truncated as offset (4245) > size (3040): /opt/mimouz/logs/MY_LOG_FILE_NAME.json.log".......}

{....... "message":"File was truncated. Begin reading file from offset 0. Path=/opt/mimouz/logs/MY_LOG_FILE_NAME.json.log",.........}

The log file has the same inode each time, and we think the issue may be caused by the log files being mounted on a network file share.

Please share your updated configuration that relies on filestream, and also the details of how you have mounted the network share (the mount settings used).