Extract elasticsearch "index" field from event field


(Suraj) #1

In filebeat, for the Kafka output, we are able to dynamically select the topic name using data from an event field, with something like %{[type]}.

In a similar fashion, is it possible to dynamically select the index name for an elasticsearch output using data from an event field?
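For reference, a minimal sketch of the Kafka setup described above. The broker address is a placeholder, not a value from this thread:

```yaml
output.kafka:
  # Placeholder broker address; replace with your own.
  hosts: ["localhost:9092"]
  # The topic is resolved per event from the event's `type` field.
  topic: "%{[type]}"
```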


(ruflin) #2

It should work in 5.0 with format strings, but I never tested it TBH. Let me know if it works as expected.


(Steffen Siering) #3

See index and indices settings.
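As a rough sketch of what those two settings can look like in a 5.x config (the host, index names, and the matched string are illustrative assumptions, not taken from this thread):

```yaml
output.elasticsearch:
  # Placeholder host; replace with your own.
  hosts: ["localhost:9200"]
  # Default index, built from the event's `type` field plus the event date.
  index: "filebeat-%{[type]}-%{+yyyy.MM.dd}"
  indices:
    # Selector rules are checked in order; the first match sets the index.
    # Events that match no rule fall back to `index` above.
    - index: "critical-%{+yyyy.MM.dd}"
      when.contains:
        message: "CRITICAL"
```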


(Suraj) #4

Thanks @steffens , @ruflin!

Another quick question around the same lines.

Would it make sense for filebeat to expose the name of the file it is reading from, so that it can be used to determine the index name (or Kafka topic, for that matter)?
For example,
if it is currently tailing the file "/var/log/docker/api.log", then
include a field in the event such as {"event_source" : "api"} or even {"event_source" : "api.log"}, along with the other metadata?

I know that filebeat already exposes the absolute path in the source field, but that is not enough for determining which index / Kafka topic to write to if you're doing this at scale. Also, adding a field to the event would require a change in the way the events are logged, which makes the transition to filebeat much more difficult.

What we need is something very similar to: "Using filename from filebeat in index pattern"


(ruflin) #5

It would be totally possible, but for advanced processing / routing I recommend using Logstash in the middle.

The reason we expose the full path and not the file name is that the file name is not necessarily unique.


(Suraj) #6

I totally see the worth in exposing the full path.

As more and more large-scale organizations start to consider beats as their option for log tailing, as seen in a few other questions on Stack Overflow as well as the Elastic forum, this feature could be really helpful to add.

The additional overhead of maintaining Logstash just to do simple extractions/inductions based on either fields in events or the path is something that I feel will hamper the adoption of filebeat (or even worse, could potentially lead to adopters maintaining their own versions of filebeat) when used at scale.

Thoughts?


(Steffen Siering) #7

One workaround would be to make use of prospector fields.

For example, create a prospector per file type and set document_type accordingly, or use:

```
filebeat.prospectors.X.fields:
  source_type: "api"
```

Then you can use %{[fields.source_type]}.

With indices (or topics for Kafka), one can use conditionals to do some more processing.
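Put together, the workaround might look roughly like the sketch below. The path comes from the example earlier in the thread; the host, index names, and the `source_type` value are illustrative assumptions. Every prospector would need to set `source_type`, since the default `index` depends on it.

```yaml
filebeat.prospectors:
  # One prospector per file type; the custom field tags every event it reads.
  - input_type: log
    paths:
      - /var/log/docker/api.log
    fields:
      source_type: "api"        # illustrative label

output.elasticsearch:
  hosts: ["localhost:9200"]     # placeholder host
  # Default index, resolved per event from the prospector field.
  index: "%{[fields.source_type]}-%{+yyyy.MM.dd}"
  indices:
    # Conditional rule: the first matching rule sets the index,
    # otherwise the default `index` above is used.
    - index: "docker-api-%{+yyyy.MM.dd}"
      when.contains:
        source: "/var/log/docker/api.log"
```

The Kafka output accepts the same pattern, with `topic` and `topics` in place of `index` and `indices`.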

But your request makes me think about introducing some kind of template-processors/functions as supported by more common templating systems. This could look somewhat like %{[source:basename]} or %{[source]|basename}. The former only on fields extracted from events, the second potentially on other value sources. I kind of like the pipe-symbol here. Imagine %{[source]|basename|trimRight('.log')}. Well, just some idea so far. Will have to think more about this.


(ruflin) #8

@surajs Are you just referring to the file name feature, or to more general processing?


(Suraj) #9

@ruflin,

My original request was just about extracting the index/topic name from the basename.

However, given what @steffens mentioned, I can totally see the value of providing a scripting interface to existing metadata.

I'd be happy to contribute to that feature should you feel that we need to add it to filebeat, or to discuss potential use cases that this request may cover.

