Can filebeat be configured to output dynamic file names?


(Kevin) #1

I suspect the answer to this is that "filebeat can't do that", but figured I'd ask anyway in case I'm missing some feature.

What I'm trying to do is harvest docker container logs and redirect them to different files based off of some attribute.

For example, if I got the following log line:
{"log":"some output","stream":"stdout","attrs":{"jobid":"11"},"time":"2016-12-08T23:22:40.132270393Z"}

I would like to direct this to an output file named "jobid-11.log", based on the jobid in the above json.
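To make the goal concrete, here is a minimal Go sketch of the mapping I'm after, from one docker json-file log line to an output file name. It is only an illustration of the desired behavior, not something Filebeat provides; `dockerLine` and `fileNameFor` are hypothetical names made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dockerLine mirrors the fields of the example log line above.
type dockerLine struct {
	Log    string            `json:"log"`
	Stream string            `json:"stream"`
	Attrs  map[string]string `json:"attrs"`
	Time   string            `json:"time"`
}

// fileNameFor is a hypothetical helper: it returns "jobid-<id>.log"
// for a raw log line, or an error if the line has no jobid attribute.
func fileNameFor(raw []byte) (string, error) {
	var l dockerLine
	if err := json.Unmarshal(raw, &l); err != nil {
		return "", err
	}
	id, ok := l.Attrs["jobid"]
	if !ok {
		return "", fmt.Errorf("no jobid attribute in log line")
	}
	return "jobid-" + id + ".log", nil
}

func main() {
	raw := []byte(`{"log":"some output","stream":"stdout","attrs":{"jobid":"11"},"time":"2016-12-08T23:22:40.132270393Z"}`)
	name, err := fileNameFor(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // prints "jobid-11.log"
}
```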

The problems I see with this are:

  1. Filebeat doesn't seem to be able to parse fields into any sort of variable; it only has limited parsing ability for filtering purposes.
  2. Filebeat doesn't seem to have any concept of variables that can be used to change the output file name.

Even if #1 is true, I could still make this work if I could take something like the "source" output field and redirect to a different file based off of some substring of that (container id), but that would still require #2 to be true.

Is filebeat simply the wrong match for what I'm doing? I'm sure I could get logstash to do this, but I was looking for a lighter weight solution.

Thanks much,
Kevin


(Andrew Kroh) #2

What's the use case for output to a file? You can't dynamically control the output filename, but you can set options for other outputs dynamically.

You can parse structured JSON logs in Filebeat. Then you can choose an ES index or a Kafka topic based on something in the event using a format string.

filebeat.prospectors:
- paths:
    - input.json
  json.keys_under_root: true

# Caution: Not tested.
output.elasticsearch:
  hosts: ['http://localhost:9200']
  index: 'filebeat-%{+yyyy.MM.dd}'
  indices:
    - index: "jobid-%{[attrs.jobid]}-%{+yyyy.MM.dd}"

(Kevin) #3

Thanks Andrew. Reading through the documentation more, it seems like something like this should work:

output.file:
  path: '/logs/'
  filename: 'job%{[jobid]}.log'

Where jobid is a field in the event's json.

The thing is, the file output doesn't seem to actually parse that out and just writes literally to 'job%{[jobid]}.log'.

I suspect this is actually an oversight in the implementation, as the docs imply I can use a format string in place of a string field:
https://www.elastic.co/guide/en/beats/libbeat/current/config-file-format-type.html#_format_string_sprintf

My use case is a little odd because our requirements don't allow us to ship these logs off the host that they ran on. I'm just moving them out of the docker container directory so that I can remove containers and still have access to the logs. I need to remove containers earlier than the logs because containers can take up a lot of disk space.

In any event, I'm guessing what I did above should work and either I'm getting the syntax wrong or it wasn't implemented for that specific output (which is described as being "for testing"). I'm tempted to look at the code and suggest a fix, but will wait to see if anyone has further comments here first.


(Kevin) #4

I looked into the source briefly, and I was correct: it's not parsing out any tokens from the name string. Since the fileOutput object uses FileRotator to do the file IO, this would need to change:

func (rotator *FileRotator) FilePath(file_no int) string { 
	if file_no == 0 { 
		return filepath.Join(rotator.Path, rotator.Name) 
	} 
	filename := strings.Join([]string{rotator.Name, strconv.Itoa(file_no)}, ".") 
	return filepath.Join(rotator.Path, filename) 
} 

Where rotator.Name would need to substitute any tokens it matched in the json data.
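As a rough idea of what that substitution might look like, here is a standalone Go sketch. It is not libbeat's format-string implementation; `expandName` and `tokenRe` are hypothetical names, and it assumes the event has already been flattened into a map of string fields.

```go
package main

import (
	"fmt"
	"regexp"
)

// tokenRe matches tokens of the form %{[field]}.
var tokenRe = regexp.MustCompile(`%\{\[([^\]]+)\]\}`)

// expandName is a hypothetical helper, not part of libbeat. It replaces
// each %{[field]} token in pattern with the matching value from fields.
func expandName(pattern string, fields map[string]string) string {
	return tokenRe.ReplaceAllStringFunc(pattern, func(tok string) string {
		key := tokenRe.FindStringSubmatch(tok)[1]
		if v, ok := fields[key]; ok {
			return v
		}
		return tok // leave unknown tokens untouched
	})
}

func main() {
	fields := map[string]string{"jobid": "11"}
	fmt.Println(expandName("job%{[jobid]}.log", fields)) // prints "job11.log"
}
```

A real version would also have to reject values containing path separators, since the field content comes from the log itself.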

But this would get messy as it would need to open the (potentially differently named) file each time it did a writeLine instead of once in FileRotator Rotate(), and track the sizes differently, so this is not a trivial change.

OTOH, I got this working with logstash in less than an hour - but with logstash swallowing up 7% of the host's memory due to the JVM, this isn't a realistic option either.

I would be willing to dive in and "fix" this to parse out tokens in the filename if this was likely to get picked up by the project. I'll try opening an issue on GitHub and see where that goes.


(Steffen Siering) #5

"Format strings" and "string" fields are two different types of strings. The file output is using plain strings only for filenames on purpose, as the file output also implements log-rotation.

Adding support for filenames via "format strings" would complicate the file output, I guess. Feel free to open a feature request.

