I am setting up an ELK (Elasticsearch, Logstash, Kibana) stack to compute statistics about our transactions.
I have prepared a VM running Logstash instances that pull from Redis queues and route structured logs to an Elasticsearch 0.90.9 cluster.
We handle roughly 15k TPS and expect that figure to double.
The indexes follow a template deployed on all nodes.
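For reference, the kind of template I mean looks roughly like the sketch below. The `logstash-*` pattern, the shard/replica counts, and the refresh interval are illustrative placeholders, not our exact values:

```json
{
  "template": "logstash-*",
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 1,
    "index.refresh_interval": "30s"
  },
  "mappings": {
    "_default_": {
      "_source": { "enabled": true }
    }
  }
}
```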
All nodes can be master.
All nodes contain data.
The problem is that the system is becoming slow.
With this setup the daily indexes grow to 20-30 GB each, easily reaching 1 billion documents in 7-10 days.
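To put those numbers in perspective, here is a quick back-of-the-envelope calculation (the shard counts compared are hypothetical, and I take midpoints of the ranges above):

```python
# Rough sizing for a 20-30 GB daily index that accumulates
# ~1 billion documents over 7-10 days.
daily_index_gb = 25          # midpoint of the 20-30 GB range
docs_total = 1_000_000_000   # ~1 billion documents
days = 8                     # midpoint of 7-10 days

docs_per_day = docs_total / days  # ~125M documents per day

for shards in (4, 8, 12):    # hypothetical shard counts to compare
    gb_per_shard = daily_index_gb / shards
    docs_per_shard = docs_per_day / shards
    print(f"{shards} shards: ~{gb_per_shard:.1f} GB and "
          f"~{docs_per_shard / 1e6:.0f}M docs per primary shard per day")
```

More shards spread the indexing load over more machines, but each shard carries a fixed overhead, so the question is where the sweet spot lies for this hardware.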
The cluster is composed of 4 machines, each with 24 cores and 96 GB of RAM.
I could add some fairly powerful VMs (8 cores, 32 GB RAM each) as non-data nodes (they are limited to 40 GB of disk space).
Should I increase the number of shards and keep replicas at 1?
Since I am continuously feeding the Elasticsearch cluster, perhaps it would be useful to add some no-data nodes to load-balance search queries.
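A minimal sketch of what such a node's elasticsearch.yml could look like on the 0.90.x line (values illustrative):

```yaml
# Dedicated "client" node: holds no shards and is never elected
# master, so it only coordinates requests and merges search results.
node.master: false
node.data: false
```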