Matching Logstash and Elasticsearch data stream configuration

Hello,

Can someone shed some light on the configuration of Logstash and Elasticsearch data streams?

I have a bunch of apps (Kubernetes pods) and I need to store the logs of some of them for one year and the rest for just one week. I found out that by setting the %{data_stream.type}, %{data_stream.dataset}, and %{data_stream.namespace} fields I can select the name of the data stream the logs are routed to.
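For reference, the output side of my pipeline looks roughly like this (the host is just a placeholder):

```
output {
  elasticsearch {
    hosts       => ["https://elasticsearch:9200"]  # placeholder endpoint
    data_stream => "true"
    # data_stream_auto_routing defaults to true, so the plugin builds the
    # target name <type>-<dataset>-<namespace> from the event's
    # [data_stream][type], [data_stream][dataset], and
    # [data_stream][namespace] fields
  }
}
```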

So I need at least two data streams: one with one-week log retention and one with one-year retention.

Now, how would I define the properties of the data streams on the Elasticsearch side? As per the documentation, a data stream requires an index template. WTF? So what would I use as the index pattern, and why? In the Logstash configuration I do not specify (I even CAN'T specify) the indices the logs would be routed to: when I put index => '....' in the output section together with data_stream => 'true', Logstash won't even start, failing with a configuration error that complains about the index being specified.
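For example, this combination is rejected at startup (the index name here is just an example):

```
output {
  elasticsearch {
    data_stream => "true"
    index       => "logs-custom"   # not allowed together with data_stream => "true"
  }
}
```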

OK, I got it. I have to create index templates matching indices named <type>-<dataset>-<namespace>. So in my case I create one index template named logs-year-<namespace>, which matches the logs-year-<namespace>* index pattern, and another named logs-week-<namespace>, which matches logs-*-<namespace>* with a lower priority than the logs-year-<namespace> template. In the Logstash filter I set the [data_stream][dataset] field to 'week' or 'year' based on the required retention for the log entry.
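For completeness, here is a sketch of how this can be wired up. Everything concrete in it is an assumption on my part: the namespace 'default', the policy names, the rollover settings, and the template priorities are examples only. The retention itself comes from ILM policies that the index templates reference:

```
PUT _ilm/policy/logs-week
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d", "max_primary_shard_size": "50gb" }
        }
      },
      "delete": {
        "min_age": "7d",
        "actions": { "delete": {} }
      }
    }
  }
}

PUT _ilm/policy/logs-year
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "30d", "max_primary_shard_size": "50gb" }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": { "delete": {} }
      }
    }
  }
}

PUT _index_template/logs-year-default
{
  "index_patterns": ["logs-year-default*"],
  "data_stream": {},
  "priority": 500,
  "template": {
    "settings": { "index.lifecycle.name": "logs-year" }
  }
}

PUT _index_template/logs-week-default
{
  "index_patterns": ["logs-*-default*"],
  "data_stream": {},
  "priority": 400,
  "template": {
    "settings": { "index.lifecycle.name": "logs-week" }
  }
}
```

And the filter side. The Kubernetes label I branch on ([kubernetes][labels][retention]) is hypothetical; substitute whatever marks your long-lived apps:

```
filter {
  # default everything to one-week retention ...
  mutate { replace => { "[data_stream][type]"      => "logs" } }
  mutate { replace => { "[data_stream][dataset]"   => "week" } }
  mutate { replace => { "[data_stream][namespace]" => "default" } }

  # ... and promote selected apps to one-year retention
  if [kubernetes][labels][retention] == "year" {
    mutate { replace => { "[data_stream][dataset]" => "year" } }
  }
}
```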


Welcome to our community and thanks for sharing your solution :smiley:
