ElasticSearch for TimeSeries + Analytics - recommended scenario

clandestino_bgd · December 8, 2015, 6:38pm

Hi,
I am building a recommendation engine and collecting about 1,000 user events per second from the website (around 10 different event types). The current plan is to use ES as the event store and Logstash as the event producer. Recent blog posts and discussions about using ES for time-series data look really promising. However, I am facing a dilemma about how to organise my indexes and searches, since the users are in different time zones.

I have the current events and the historical events (up to 3 months in the past).
Both have the same format: user, sessionid, item, event_type, timestamp
I thought of indexing the current events in the daily Logstash index, but then I can hit the problem of one session being split across 2 indexes. Queries on current events need to be ultra fast, which is why I wanted daily indexes.
Does it make sense to have a "current" alias that includes today's and yesterday's indexes?

For historical events, I need to aggregate per user, item and event type over the last 3 months.
I would need auto-aliasing and auto-purging of indexes older than now - 3 months.

Does the whole idea make sense? Has anybody done something like this in the past?
Is there an example somewhere where this kind of auto-aliasing and auto-purging is implemented?

Many thanks in advance,
Milan

erikstephens · December 8, 2015, 7:11pm

A "current" alias makes sense. I think you'll find that Elasticsearch Curator will help you address a lot of your index management needs.
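
To make the housekeeping concrete, here is a minimal sketch of the two daily decisions discussed above: which indexes the "current" alias should cover, and which indexes have aged out. It is not Curator's implementation, just plain Python that computes the plan. Assumptions: indexes follow Logstash's default daily naming (logstash-YYYY.MM.DD, rolled over at UTC midnight), "3 months" is approximated as 90 days, and the function names are hypothetical. The dictionary returned by alias_actions has the shape of a request body for the Elasticsearch `_aliases` endpoint; sending it, and deleting the expired indexes, is left to whatever client you use.

```python
from datetime import date, timedelta

INDEX_PREFIX = "logstash-"  # assumption: Logstash default daily index naming
ALIAS = "current"           # alias name taken from the question
RETENTION_DAYS = 90         # assumption: "3 months" approximated as 90 days


def index_name(day):
    """Name of the daily index for a given (UTC) date."""
    return INDEX_PREFIX + day.strftime("%Y.%m.%d")


def parse_day(name):
    """Return the date encoded in a daily index name, or None if it does not match."""
    if not name.startswith(INDEX_PREFIX):
        return None
    try:
        year, month, day = name[len(INDEX_PREFIX):].split(".")
        return date(int(year), int(month), int(day))
    except ValueError:
        return None


def alias_actions(existing, aliased, today):
    """Build an _aliases request body so ALIAS covers only today and yesterday.

    existing: names of all indexes in the cluster
    aliased:  names of indexes that currently carry ALIAS
    """
    keep = {index_name(today), index_name(today - timedelta(days=1))}
    to_remove = sorted(set(aliased) - keep)
    to_add = sorted((keep & set(existing)) - set(aliased))
    actions = [{"remove": {"index": name, "alias": ALIAS}} for name in to_remove]
    actions += [{"add": {"index": name, "alias": ALIAS}} for name in to_add]
    return {"actions": actions}


def expired_indices(existing, today):
    """Daily indexes whose date is older than today minus RETENTION_DAYS."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    expired = []
    for name in existing:
        day = parse_day(name)
        if day is not None and day < cutoff:
            expired.append(name)
    return sorted(expired)
```

Running both functions once a day (from cron, for example) keeps the alias pointing at the two most recent indexes, so a session that crosses midnight UTC is still found by a single query against "current". Because the add and remove actions go in one request body, the alias is swapped in a single call rather than index by index.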
