Rolling indices daily best practices?

We are using ES as a data store for events from devices. By the end of the year I expect a few hundred million events to be written every day.

My plan is to create two aliases which will get used by clients:

  • An alias ("events-current") that points to the current day's index
  • Another ("events-all") that contains all of the event indices.

To do this I am planning to create a script that will:

  • Export the mappings from the index behind events-current
  • Create a daily index "events-YYYY.MM.DD"
  • Apply the mappings from the previous day's index to the new index
  • Remove the previous day's index from the "events-current" alias
  • Add the new index to both the "events-current" and "events-all" aliases
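For what it's worth, the alias part of the steps above can go into a single request to the `_aliases` endpoint, whose actions are applied atomically. A minimal sketch in Python that only builds the request body; the helper names `index_name` and `rollover_actions` are made up for illustration, and sending the body (for example with `POST /_aliases`) is left to whatever client you use:

```python
from datetime import date, timedelta


def index_name(day: date) -> str:
    """Return the daily index name, e.g. events-2016.03.01."""
    return day.strftime("events-%Y.%m.%d")


def rollover_actions(today: date) -> dict:
    """Build the body for one POST /_aliases request.

    Hypothetical helper: it moves "events-current" from yesterday's
    index to today's and adds today's index to "events-all".
    """
    new_index = index_name(today)
    old_index = index_name(today - timedelta(days=1))
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": "events-current"}},
            {"add": {"index": new_index, "alias": "events-current"}},
            {"add": {"index": new_index, "alias": "events-all"}},
        ]
    }
```

The new index has to exist before the request is sent, and on the very first run (no previous index) the `remove` action would probably need to be dropped.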

I can do this with shell scripts but there has to be a better way. I'm pretty sure I'm doing the same thing that logstash does by default but wanted to know if I'm missing anything or anyone had a suggestion of a better way to set this up.


You can read about Curator:

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html

"Apply the mappings from the previous day's index to the new index": use templates

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html

But you will still have to do the "CRON" task yourself.
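As a rough illustration of the template suggestion, here is a sketch of a template body that would apply to every index matching `events-*` and also put each new index into the "events-all" alias. It assumes a recent Elasticsearch with composable templates (`PUT /_index_template/<name>`); older versions use `PUT /_template/<name>` with a flatter body. The function name, the shard setting, and the `timestamp` and `device_id` fields are made-up examples, not anything from the original setup:

```python
def events_template() -> dict:
    """Build a composable index template body for the daily event indices.

    Assumption: field names and settings below are placeholders.
    """
    return {
        "index_patterns": ["events-*"],
        "template": {
            "settings": {"number_of_shards": 1},
            "mappings": {
                "properties": {
                    "timestamp": {"type": "date"},
                    "device_id": {"type": "keyword"},
                }
            },
            # Every index created from this template joins events-all.
            "aliases": {"events-all": {}},
        },
    }
```

With something like this in place, the daily job only has to create the index and move "events-current".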


Why not just use a template instead?

Does the template control when to create new indices? I thought it just defined the settings for a specific index.

Sorry for the late response, I was out of town last week. The mappings may change from day to day: when a device registers, it can define a semi-arbitrary set of events that it supports. If I used templates, I'd have to update the template every time I added a new type of event.

Templates do support pattern matching to some degree but I see your point.

I should mention that you should pay attention to the total number of fields in your indexes. Each field has overhead, and letting devices create whatever fields they want might destabilize the cluster. It's fine if devices don't create more than they need, but you'll be vulnerable to mistakes like UUIDs as field names.
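One way to keep an eye on this is to count the fields in the mapping that `GET /<index>/_mapping` returns and compare the result with the `index.mapping.total_fields.limit` setting (1000 by default in recent versions). The sketch below is only an approximation of how Elasticsearch counts; `count_fields` is a hypothetical helper, and it expects the inner mapping object (the part containing "properties"):

```python
def count_fields(mapping: dict) -> int:
    """Approximate the number of mapped fields in a mapping dict.

    Counts each entry under "properties", recurses into object
    fields, and adds any multi-fields declared under "fields".
    """
    total = 0
    for field in mapping.get("properties", {}).values():
        total += 1                              # the field itself
        total += count_fields(field)            # sub-fields of an object
        total += len(field.get("fields", {}))   # multi-fields
    return total
```

A sudden jump in this number from one day's index to the next is a good hint that something like a UUID ended up as a field name.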

Thanks for the feedback!
I do think I could do something like

  1. Export the existing index's mapping as a template
  2. Create the new index & make the alias changes

and it would at least save one step.
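A sketch of what step 1 could look like, assuming a recent Elasticsearch where `GET /<index>/_mapping` returns `{"<index>": {"mappings": {...}}}` and composable templates are available. The function name and the `events-*` pattern are assumptions for illustration; the result would be sent with `PUT /_index_template/<name>`:

```python
def template_from_mapping_response(response: dict, index: str) -> dict:
    """Turn a GET /<index>/_mapping response into a template body.

    Hypothetical helper: copies the mapping of `index` unchanged so
    that tomorrow's events-* index starts with the same fields.
    """
    mappings = response[index]["mappings"]
    return {
        "index_patterns": ["events-*"],
        "template": {"mappings": mappings},
    }
```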

We are already using templates to explicitly set some field types where we've had some confusion about the type of data (boolean/string etc).

There should not be that many event types, maybe 100 or so altogether. Longer term we're going to split the indexes by device type as well. My main concern is that I don't want to be manually updating the templates every time we add a device or new functionality to an existing device.

Yeah, manual would be a pain. When I maintained some production indexes I used "dynamic": false and added the new properties with a script. I just didn't want anything sneaking in properties I didn't know about. But I added properties much less frequently than it sounds like you will.
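Roughly, that approach could look like the sketch below: the mapping is created with "dynamic": false, so unknown fields stay in `_source` but are not indexed, and a script sends only the properties that are not mapped yet to `PUT /<index>/_mapping`. Both function names are made up for illustration, and the code only builds the request bodies:

```python
def strict_mapping(properties: dict) -> dict:
    """Mapping body that ignores fields not listed in `properties`."""
    return {"dynamic": False, "properties": properties}


def new_properties_body(existing: dict, wanted: dict) -> dict:
    """Body for PUT /<index>/_mapping with only the missing fields.

    `existing` is the current mapping, `wanted` maps field names to
    their definitions; fields that are already mapped are left out.
    """
    known = existing.get("properties", {})
    missing = {name: spec for name, spec in wanted.items() if name not in known}
    return {"properties": missing}
```

The script that registers a device could call something like `new_properties_body` with the event fields the device declares, so nothing gets mapped without going through that one place.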
