Version
8.18.4
Operating System
macOS (build & test)
Steps to Reproduce
1. Configuration
filebeat.inputs:
  - type: filestream
    id: filebeat-test
    paths:
      - /path/to/log/*
    parsers:
      - include_message.patterns: ["ERROR", "END_OF_LOG", "SQLException"]
      - multiline:
          type: pattern
          pattern: '^\['
          negate: true
          match: after
          timeout: 10s
          flush_pattern: 'END_OF_LOG'
2. Sample Log
[ERROR] something
this line should be dropped
END_OF_LOG
[ERROR] SQLException occurred
java.sql.SQLException: ...
    at com.example.A()
    at com.example.B()
END_OF_LOG
[ERROR] SQLException occurred
java.sql.SQLException: ...
    at com.example.C()
    at com.example.D()
END_OF_LOG
3. Execution
- Run Filebeat
- Let it process the file
- Stop Filebeat
- Restart Filebeat
Observed Behavior
- Filebeat re-reads already processed data
- Registry offset is smaller than actual file position
- The resulting registry offset was 165, while the actual log file size was 285 bytes
- Duplicate events are generated after restart
Expected Behavior
- Registry offset should match the actual file read position
- No duplicate events after restart
Root Cause Analysis
The issue occurs when include_message and multiline are used together.
- include_message stores discarded bytes in Message.Offset
- filestream updates registry offset using:
offset += message.Bytes + message.Offset
However, during multiline aggregation:
File: libbeat/reader/multiline/message_buffer.go
func (b *messageBuffer) addLine(m reader.Message) {
	b.message.Bytes += m.Bytes
	b.message.AddFields(m.Fields)
}
m.Offset is not accumulated.
Result
- Offset information from include_message is lost
- Final message.Offset becomes incorrect
- Registry offset becomes smaller than actual file position
- Causes duplicate ingestion after restart
Proposed Fix
Accumulate Offset in multiline message buffer:
func (b *messageBuffer) addLine(m reader.Message) {
	b.message.Bytes += m.Bytes
	b.message.Offset += m.Offset
	b.message.AddFields(m.Fields)
}
Validation
After applying the fix and rebuilding Filebeat:
- Offset is correctly updated
- The resulting registry offset was 285, which matches the actual log file size
- No duplicate events after restart
- Behavior matches expected file read position
Conclusion
The root cause appears to be that multiline aggregation does not preserve Message.Offset.