Hello, I'm about to build a very small cluster and, since I'm new to
Elasticsearch, I would like to hear your opinion on my current design and
thought process. Basically, I need to store and search log files. The logs
should be kept for six months.
At the beginning, there will be only one machine (the only specs I know
are 4GB of RAM and 4 cores); an additional machine might come in the near
future. There will be around 200 million documents every month (~170GB).
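To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It uses only the figures above (200 million documents and ~170GB per month, six months of retention). Whether the 170GB is raw log size or on-disk index size is not something I know yet, so the results are only an order-of-magnitude estimate, and I am assuming decimal gigabytes (10^9 bytes).

```python
# Rough capacity estimate from the numbers in this post.
# Assumption: 1 GB = 10**9 bytes, and ~170GB is the monthly data volume
# (the on-disk index size may well differ).
DOCS_PER_MONTH = 200_000_000
GB_PER_MONTH = 170
RETENTION_MONTHS = 6

# Totals held at the end of the retention window.
total_docs = DOCS_PER_MONTH * RETENTION_MONTHS
total_gb = GB_PER_MONTH * RETENTION_MONTHS

# Average size of one log document.
bytes_per_doc = GB_PER_MONTH * 10**9 / DOCS_PER_MONTH

print(f"documents retained: {total_docs:,}")
print(f"data retained:      {total_gb} GB")
print(f"average document:   {bytes_per_doc:.0f} bytes")
```

So the single machine would eventually hold on the order of a billion documents and about a terabyte of data.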
Given this setup, I thought of this:
Make one index for every week - i.e. logs_2012_31, logs_2012_32... This
will allow me to query just logs_2012_* (or create an alias?) or only a
small number of indices when given a time range. This should also help when
Every index will have 2 shards (so that I don't need to reindex when
adding the other machine), no replicas, and refresh_interval set to -1.
_source should be compressed.
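To make the plan concrete, here is a minimal Python sketch of the weekly naming scheme and the per-index settings described above. The function names are hypothetical helpers of mine, not part of any client library, and I am assuming ISO week numbers for the `logs_YYYY_WW` suffix (I have not decided which week convention to use). The settings body uses the standard `number_of_shards`, `number_of_replicas`, and `refresh_interval` index settings; `_source` compression is left out because how it is configured depends on the Elasticsearch version.

```python
from datetime import date, timedelta

# Settings body that would be sent when creating each weekly index.
INDEX_SETTINGS = {
    "settings": {
        "index": {
            "number_of_shards": 2,     # room to spread over a second machine
            "number_of_replicas": 0,   # single node, so no replicas
            "refresh_interval": "-1",  # automatic refresh disabled
        }
    }
}


def weekly_index_name(day: date) -> str:
    """Return the weekly index name for a date (assumes ISO week numbering)."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"logs_{iso_year}_{iso_week:02d}"


def weekly_indices(start: date, end: date) -> list:
    """Return the names of all weekly indices overlapping [start, end]."""
    if end < start:
        raise ValueError("end must not be before start")
    # Step from the Monday of the first week, one week at a time.
    current = start - timedelta(days=start.weekday())
    names = []
    while current <= end:
        names.append(weekly_index_name(current))
        current += timedelta(days=7)
    return names


if __name__ == "__main__":
    # A query for 30 July to 8 August 2012 only needs two indices.
    print(",".join(weekly_indices(date(2012, 7, 30), date(2012, 8, 8))))
```

The comma-joined names could then be used as the index part of a search request, so a time-range query only touches the weeks it needs.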
Does that sound reasonable? And can the HW actually handle this load? Thank