Using Filebeat with the rollover pattern

sterago · October 29, 2018, 8:38am

Hi,

my team and I are looking for a solution to keep the size of indices below a certain threshold and we are looking at the Rollover API with interest. We are already using Curator for data retention purposes and thought we could add rollover actions there.

One problem we have, though, is with dynamic index names.

We are using Filebeat to ship logs directly to Elasticsearch from tens of different applications running on cloud infrastructure. Each application has its own index name, with monthly suffixes, which simplifies the data retention task.

The question is how do we tell Filebeat that, when creating a new index at the start of each month, it should really create an alias + an initial index that can be rolled over when the size condition matches?

The only solution we have found is to create aliases upfront and point Filebeat to such aliases, but this would need to be done every month and it's not a clean solution.

Is there something we are missing?
Also, how do we create a generic rollover action in Curator that matches all possible indices?

Thanks for any help

Andrew_Cholakian1 · October 29, 2018, 7:27pm

Question, why is an alias required on the write side? What is the read pattern? Have you considered using aliases for the read side? Understanding that will help in answering this question.

BTW, one tool that might help in cases like yours in the future is the upcoming Index Lifecycle Management (ILM) feature in ES, which you can track here. It will probably be the best solution for this problem.

theuntergeek · October 30, 2018, 1:39am

Let me respond to this part of your question:

To truly simplify your data retention task, you should probably rethink the need for dynamically named indices with monthly suffixes. Rollover and Curator eliminate the need for adding dates to the index name.

From the official documentation:

PUT /logs-000001 
{
  "aliases": {
    "logs_write": {}
  }
}

For each Filebeat index you need, you would create one index with its accompanying alias, and then roll over when the conditions are met, using Curator.

You do not need to create a new index every month when you can just use Rollover. Use creation_date to filter based on when the index was created. Alternatively, you could use field_stats to determine the min_value and/or max_value for the timestamp field in each index.
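
As a rough sketch of the rollover step described above, the request below asks Elasticsearch to roll over whichever index the logs_write alias currently points to. The max_size and max_age values are placeholder assumptions, not recommendations. Curator's rollover action takes the same alias name and conditions and makes this call whenever it runs.

```
# Assumed thresholds; adjust to your own size and retention targets.
POST /logs_write/_rollover
{
  "conditions": {
    "max_size": "50gb",
    "max_age": "30d"
  }
}
```

If any one condition is met, Elasticsearch creates logs-000002 and moves the alias to it; otherwise the request changes nothing.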

sterago · October 30, 2018, 9:25am

why is an alias required on the write side?

The write alias would allow Filebeat to be agnostic about rollover operations.

What is the read pattern?

We use index patterns that match all indices for a specific application.

Thanks for your input

sterago · October 30, 2018, 9:33am

We thought about removing the monthly suffixes, but we'd still have dynamic index names coming from the application name. Would you suggest getting rid of those as well?

Using the name of the application in the index name gives us a couple of benefits:

Developers of different applications can focus on their own application's logs by selecting just the index pattern for that application, e.g. in Kibana. We could still achieve this by using a dedicated field in the log documents and then creating read aliases that filter documents based on that field, but we are not sure how much of a performance penalty that would incur.
Different applications might use the same field names but with different mappings, and keeping separate indices helps avoid clashes.
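
For reference, the filtered read alias mentioned above could look roughly like the sketch below. The field name application, the value app1, the alias name app1_read and the logs-* pattern are all hypothetical names chosen for illustration.

```
# Hypothetical names: logs-*, app1_read and the "application" field.
POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "logs-*",
        "alias": "app1_read",
        "filter": {
          "term": { "application": "app1" }
        }
      }
    }
  ]
}
```

Note that the wildcard is resolved when the request runs, so indices created later by a rollover would not pick up the alias automatically. Declaring the alias in an index template is one way around that.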

Thanks for helping with this.

Andrew_Cholakian1 · October 30, 2018, 4:53pm

I think that yes, for your scenario removing the dynamic names is the only way, but I'll defer to @theuntergeek on this one.

sterago · November 1, 2018, 7:41am

I see, @Andrew_Cholakian1. OK, let's wait to hear from @theuntergeek then.

I am keen to understand how we should address the possible field mapping conflicts if we went for that solution.

theuntergeek · November 5, 2018, 2:04pm

Yes, remove dynamic index names to fully use Rollover. It doesn't mean you have to live with mapping conflicts, though. You can still create one rollover pattern per data type.

For example, if filebeat1 has the same data type as filebeat8, you could make a rollover alias called foo which points to foo-000001. And if filebeat2 has the same data type as filebeat5, you could make a rollover alias called bar which points to bar-000001.

Keep the data types consistent with each other and there are no more mapping conflicts. Just configure Filebeat to point to the alias for that data type.
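
A minimal sketch of bootstrapping the two rollover patterns from the example above, following the same form as the documentation snippet earlier in the thread. The names foo and bar are the placeholder aliases used in this post, not real index names.

```
# One initial index per data type, each with its own rollover alias.
PUT /foo-000001
{
  "aliases": {
    "foo": {}
  }
}

PUT /bar-000001
{
  "aliases": {
    "bar": {}
  }
}
```

Each Filebeat instance would then set output.elasticsearch.index to foo or bar. In Filebeat 6.x, changing the index name also means setting setup.template.name and setup.template.pattern to match.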

sterago · November 12, 2018, 10:10am

Thanks for your input @theuntergeek

We experimented with this solution but we'd end up having as many aliases as the number of applications.

Would it be an option to use ingest pipelines, perhaps with a scripting processor, to create an index alias on the fly? If this is possible, we would be able to retain our per-application index names.

theuntergeek · November 12, 2018, 12:44pm

It would actually be easier to flatten your data more and put more fields into an index than go that route, I think. Otherwise, yes, as many aliases as applications is still what I’d recommend.

system · December 10, 2018, 12:44pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
