Filebeat filestream offset mismatch when using include_message with multiline

Version

8.18.4

Operating System

macOS (build & test)


Steps to Reproduce

1. Configuration

filebeat.inputs:
- type: filestream
  id: filebeat-test
  paths:
    - /path/to/log/*
  parsers:
    - include_message.patterns: ["ERROR", "END_OF_LOG", "SQLException"]
    - multiline:
        type: pattern
        pattern: '^\['
        negate: true
        match: after
        timeout: 10s
        flush_pattern: 'END_OF_LOG'

2. Sample Log

[ERROR] something
this line should be dropped
END_OF_LOG
[ERROR] SQLException occurred
java.sql.SQLException: ...
    at com.example.A()
    at com.example.B()
END_OF_LOG
[ERROR] SQLException occurred
java.sql.SQLException: ...
    at com.example.C()
    at com.example.D()
END_OF_LOG


3. Execution

  1. Run Filebeat
  2. Let it process the file
  3. Stop Filebeat
  4. Restart Filebeat

Observed Behavior

  • Filebeat re-reads already processed data
  • Registry offset is smaller than actual file position
    • The resulting registry offset was 165, while the actual log file size was 285 bytes
  • Duplicate events are generated after restart

Expected Behavior

  • Registry offset should match the actual file read position
  • No duplicate events after restart

Root Cause Analysis

The issue occurs when include_message and multiline are used together.

  • include_message records the size of the lines it discards in Message.Offset
  • filestream advances the registry offset for each published event using:
offset += message.Bytes + message.Offset

However, during multiline aggregation:

File: libbeat/reader/multiline/message_buffer.go

func (b *messageBuffer) addLine(m reader.Message) {
    b.message.Bytes += m.Bytes
    b.message.AddFields(m.Fields)
}

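addLine adds m.Bytes to the buffered message but never adds m.Offset, so any discarded-byte count carried by an appended line is lost.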


Result

  • Offset information from include_message is lost for every line appended to a multiline event
  • The aggregated message.Offset is too small
  • The registry offset ends up smaller than the actual file position
  • Already-processed data is ingested again after restart

Proposed Fix

Accumulate Offset in the multiline message buffer:

func (b *messageBuffer) addLine(m reader.Message) {
    b.message.Bytes += m.Bytes
    // Carry over bytes discarded by earlier parsers such as include_message.
    b.message.Offset += m.Offset
    b.message.AddFields(m.Fields)
}

Validation

After applying the fix and rebuilding Filebeat:

  • The registry offset is updated correctly
    • The resulting registry offset was 285, which matches the actual log file size
  • No duplicate events after restart
  • The registry offset matches the expected file read position

Conclusion

The root cause appears to be that multiline aggregation does not preserve Message.Offset.