Time-based indices and automation in node.js

kevinfez · December 14, 2016, 8:16pm

Hi, I'm new to the Elasticsearch scene and have quite a lot of questions.

First of all, my use case at the moment is pretty simple: when a user performs a "registration" action on my site, I want to add a document logging this occurrence so that I can track and visualize in Kibana when and how many users are registering.
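A minimal sketch of what such a registration document might look like. The thread never specifies a document shape, so the field names (`@timestamp`, `event`, `userId`) and the helper name `buildRegistrationEvent` are assumptions for illustration only:

```javascript
// Build one immutable "registration" event document.
// All field names here are hypothetical; pick whatever your Kibana
// index pattern will use as its time field.
function buildRegistrationEvent(userId, when = new Date()) {
  return {
    "@timestamp": when.toISOString(), // ISO 8601 in UTC
    event: "registration",
    userId: String(userId),
  };
}

const doc = buildRegistrationEvent(42, new Date(Date.UTC(2016, 11, 14, 20, 16)));
console.log(JSON.stringify(doc));
```

Keeping the timestamp in UTC avoids ambiguity when the document is later bucketed by day.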

After doing a lot of reading, my high-level takeaway is that I want time-based indices. A daily or weekly index seems appropriate: slow days will be fine, and days when potentially many users register won't blow up a single index.

A common approach seems to be to set up an index template and just let the index be created when you insert (i.e. a logs_* template where the * is today's date). Another is to create the indices yourself (I've come to assume) and set up one alias for "today's index" and another alias for the past 3 months for searching.
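For the template approach, the request body might look roughly like the sketch below. It assumes the Elasticsearch 2.x template format (the `template` key, a mapping type, and `string` fields with `not_analyzed`). The shard and replica counts, the mapping type name `event`, and the field names are all illustrative assumptions, not something stated in this thread:

```javascript
// Body for PUT /_template/<name> on an Elasticsearch 2.x cluster (assumed version).
// Every new index whose name matches "logs_*" picks up these settings and mappings.
function buildLogsTemplate() {
  return {
    template: "logs_*",
    settings: {
      number_of_shards: 1, // assumption: low-volume daily indices
      number_of_replicas: 1,
    },
    mappings: {
      event: { // hypothetical mapping type name
        properties: {
          "@timestamp": { type: "date" },
          event: { type: "string", index: "not_analyzed" },
          userId: { type: "string", index: "not_analyzed" },
        },
      },
    },
  };
}

console.log(JSON.stringify(buildLogsTemplate(), null, 2));
```

With a template like this in place, the first document indexed into a new day's index creates that index with the mapping already applied.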

Unfortunately, I'm limited to AWS ES, so it seems I can't use the fancy new rollover API, which was a bummer since the blog has a nice post about it.

So, in summary my main questions are:
1. For my use case, would it be better to rely on auto-generated indices, or try to have more control with aliases?
2. These documents won't be modified after insert, so are there any optimizations I can make?
3. How can I manage old indices? Once the day/week ends, I no longer need to devote any resources to writing to that index, just to reading it for the occasional search.
4. If I do go with aliases, since I don't have the rollover API, what's the best way to make sure my aliases are kept up to date and my old indices are removed from that "last 3 months" window?
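Without rollover, one way to keep a "last 3 months" alias current is a scheduled job that computes the add/remove actions and posts them to the `_aliases` endpoint. The sketch below only builds the request body. It assumes daily index names like `logs_2016-12-14`, a hypothetical alias name `logs_last_3_months`, and a 90-day window; the caller supplies the list of existing indices and the list of indices that currently carry the alias:

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Parse "logs_YYYY-MM-DD" into a UTC timestamp, or null if the name does not match.
function indexDate(name) {
  const m = /^logs_(\d{4})-(\d{2})-(\d{2})$/.exec(name);
  return m ? Date.UTC(Number(m[1]), Number(m[2]) - 1, Number(m[3])) : null;
}

// Build a body for POST /_aliases: add in-window indices that lack the alias,
// remove out-of-window indices that still carry it.
function buildAliasActions(indexNames, aliasedNames, today,
                           windowDays = 90, alias = "logs_last_3_months") {
  const todayUtc = Date.UTC(today.getUTCFullYear(), today.getUTCMonth(), today.getUTCDate());
  const aliased = new Set(aliasedNames);
  const actions = [];
  for (const index of indexNames) {
    const ts = indexDate(index);
    if (ts === null) continue; // not one of our daily indices
    const ageDays = Math.round((todayUtc - ts) / DAY_MS);
    const inWindow = ageDays >= 0 && ageDays < windowDays;
    if (inWindow && !aliased.has(index)) actions.push({ add: { index, alias } });
    if (!inWindow && aliased.has(index)) actions.push({ remove: { index, alias } });
  }
  return { actions };
}

const body = buildAliasActions(
  ["logs_2016-12-14", "logs_2016-09-01"],
  ["logs_2016-09-01"],
  new Date(Date.UTC(2016, 11, 14))
);
console.log(JSON.stringify(body));
```

Only emitting a remove for indices that actually carry the alias avoids asking the cluster to remove an alias that is not there.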

warkolm · December 15, 2016, 12:21am

1. I wouldn't worry about aliases with this use case
2. Nope, ES handles this automatically
3. Elasticsearch Curator can do a lot of things
4. Curator can also do aliases, but I wouldn't bother

There are other options

kevinfez · December 15, 2016, 7:54pm

Thanks for the reply Mark,

I've decided auto-generated indices will probably be the easiest for me; I'm going to specify the index as logs_YYYY-MM-DD whenever adding a doc so it'll just be auto-generated for the new days.
This allows me to basically ditch the alias work, like you suggested. I may still end up using aliases later on, when I have additional use cases that require a lot more documents: it would be nice to limit the number of shards being accessed through an alias, instead of having to run a query on the entire set of available/open indices with time as a filter.
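The logs_YYYY-MM-DD naming could be computed in Node.js along these lines. This is a sketch only: the helper name `dailyIndexName` is hypothetical, and it assumes the day boundary is taken in UTC rather than local time:

```javascript
// Return the daily index name for a given moment, e.g. "logs_2016-12-14".
// toISOString() is always UTC, so the first 10 characters are YYYY-MM-DD in UTC.
function dailyIndexName(date = new Date(), prefix = "logs_") {
  return prefix + date.toISOString().slice(0, 10);
}

console.log(dailyIndexName(new Date(Date.UTC(2016, 11, 14, 20, 16))));
```

The returned string would be passed as the index name on each insert, so the first document of a new day creates that day's index.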

I've also been checking out Curator. I plan on following this blog post to set Curator up with AWS ES once I've gotten it all working on my local machine.

Thanks!

theuntergeek · December 15, 2016, 10:33pm

I hate to be the bearer of bad news, but Curator doesn't work with AWS ES, whether you're using a lambda or not. See the compatibility matrix.

kevinfez · December 15, 2016, 10:45pm

Hi Aaron,

Oh, I did check out the compatibility matrix. I'm using Curator 3.5 and planning to run it against our AWS ES 2.3. I know it says I won't be able to take snapshots, but are you saying it still wouldn't work?

theuntergeek · December 15, 2016, 11:27pm

3.5.1 will work with limited functionality.

kevinfez · December 15, 2016, 11:44pm

well that sounds a bit ominous ; )

In your opinion, Aaron, am I going about this wrong? My main goal here is to be able to store one-off events from my site that won't change, and to visualize with Kibana how many are happening and when. With all the warnings about needing to scale, preventing data loss, and crippling performance issues if everything isn't perfect from the get-go, perhaps I've overcomplicated things by trying to set up all these time-based indices and aliases, and by using Curator to "optimize" and close indices over time and such? Perhaps you could suggest some common metric-storing architectures or stacks using ES that I can look into?

theuntergeek · December 16, 2016, 12:19am

The limits are what you've already discovered: snapshots are not going to work as desired.

Otherwise, what you're trying to do sounds just fine.

kevinfez · December 16, 2016, 12:38am

alrighty, thanks for the info

system · January 13, 2017, 12:38am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
