Not sure how you get this formula. You should allocate at most ~50% of the
disk space on each node, so there is room left for temporary merge segments
and for shards moving around during recovery. Then you need to estimate the
compression factor, because ES compresses data by default; for typical log
files this is roughly 4 or 5. Example: with 4 disks of 2TB each in a RAID0
per machine, you have 8TB raw per node, of which ~4TB is usable as index
space. Four such nodes give 16TB of index space, which at a compression
factor of ~4 corresponds to roughly 64TB of raw input. The more nodes the
better. I doubt that 4 nodes can handle 120,000 events per second. Assuming
a single node can index around 10,000 events per second, you may need 12
nodes just to keep up with indexing at full capacity. And there is quite a
difference between Amazon EC2 and bare metal servers, which is also a
factor in the formula.
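As a rough back-of-the-envelope sketch (in Python), assuming the
illustrative numbers above (4 x 2TB disks in RAID0 per node, ~50% of disk
usable for index data, compression factor ~4, ~10,000 events/second per
node), you could estimate the node count like this; the function name and
parameters are made up for illustration only:

import math

def nodes_needed(raw_input_tb, events_per_sec,
                 disks_per_node=4, disk_tb=2.0,
                 usable_fraction=0.5, compression=4.0,
                 events_per_node=10_000):
    # Index size on disk after compression
    index_tb = raw_input_tb / compression
    # Usable index space per node (RAID0 capacity, ~50% reserved)
    per_node_tb = disks_per_node * disk_tb * usable_fraction
    storage_nodes = math.ceil(index_tb / per_node_tb)
    # Nodes needed to keep up with the indexing rate
    throughput_nodes = math.ceil(events_per_sec / events_per_node)
    return max(storage_nodes, throughput_nodes)

print(nodes_needed(raw_input_tb=64, events_per_sec=120_000))  # -> 12

With these assumptions the indexing rate, not the storage, is what drives
the cluster size.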
Jörg