Retrieve log line given offset

Because our log files are huge, I chose not to store the original log lines. Instead, I store the file path and the offset of each log line so that I can retrieve the lines later.

It says here that "(The exported field offset) is the file offset the reported line starts at." However, when I try to retrieve the log line at that offset, I actually get the next log line. The offset seems to point at the end of the log line instead of the beginning.

Do I need to read the log line backwards from offset? Is there a better solution?

Which Filebeat version are you using?

Looks like a very unfortunate bug to me. You can follow the issue on GitHub here: https://github.com/elastic/beats/issues/4587

As a workaround: the string (in JSON) should be UTF-8 encoded. Computing offset - byte size of string - 1 (for the newline character) should give you the line's start offset. If you do some event processing in Logstash or an Elastic ingest pipeline, you can adjust the offset there.

Hi Steffen,

I'm using Filebeat 5.4.1.

In the source code, these are lines 239-277:

    // Get copy of state to work on
    // This is important in case sending is not successful so on shutdown
    // the old offset is reported
    state := h.getState()
    state.Offset += int64(message.Bytes)

    // Create state event
    data := util.NewData()
    if h.source.HasState() {
        data.SetState(state)
    }

    text := string(message.Content)

    // Check if data should be added to event. Only export non empty events.
    if !message.IsEmpty() && h.shouldExportLine(text) {

        data.Event = common.MapStr{
            "@timestamp": common.Time(message.Ts),
            "source":     state.Source,
            "offset":     state.Offset, // Offset here is the offset before the starting char.
        }
        data.Event.DeepUpdate(message.Fields)

        // Check if json fields exist
        var jsonFields common.MapStr
        if fields, ok := data.Event["json"]; ok {
            jsonFields = fields.(common.MapStr)
        }

        if h.config.JSON != nil && len(jsonFields) > 0 {
            reader.MergeJSONFields(data.Event, jsonFields, &text, *h.config.JSON)
        } else if &text != nil {
            if data.Event == nil {
                data.Event = common.MapStr{}
            }
            data.Event["message"] = text
        }
    }

Line 243 first adds the current message length to state.Offset:

    state.Offset += int64(message.Bytes)

Then line 259 sets the exported field "offset" to state.Offset, which by then already points past the line, despite what the comment says:

    "offset":     state.Offset, // Offset here is the offset before the starting char.

If we change the order of these two sections of code, would it solve the issue?
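
To illustrate the idea, here is a simplified, self-contained model of the two orderings. It is not Filebeat code: the state and event types and both function names are hypothetical, and it assumes message.Bytes counts the line plus its newline. Rather than literally swapping the sections, it remembers the start offset before incrementing, on the assumption that the state should still carry the advanced offset.

```go
package main

import "fmt"

// state is a hypothetical stand-in for the harvester state.
type state struct {
	Offset int64
}

// event records the offset exported with the line and the offset kept
// in the state after the line has been consumed.
type event struct {
	ExportedOffset int64
	StateOffset    int64
}

// exportCurrent mirrors the order in the quoted snippet: the offset is
// advanced first, so the exported value points past the line.
func exportCurrent(s state, messageBytes int) event {
	s.Offset += int64(messageBytes)
	return event{ExportedOffset: s.Offset, StateOffset: s.Offset}
}

// exportProposed captures the start offset before advancing, so the
// exported value points at the first byte of the line.
func exportProposed(s state, messageBytes int) event {
	start := s.Offset
	s.Offset += int64(messageBytes)
	return event{ExportedOffset: start, StateOffset: s.Offset}
}

func main() {
	// A 7-byte line ("second\n") that starts at byte 6.
	s := state{Offset: 6}
	fmt.Println(exportCurrent(s, 7))  // prints: {13 13}
	fmt.Println(exportProposed(s, 7)) // prints: {6 13}
}
```

In this model the state offset is the same in both versions; only the exported field changes. Whether that holds in the real harvester would need to be checked against the source.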

Thanks. We're still wondering when this bug was introduced :confused:

Feel free to open a PR: https://github.com/elastic/beats/pulls
Also see the contribution guide if you want to provide a PR with a fix and a system test.