There is no right (or wrong) answer. The best way is to test. Start with
the default of 5 shards and load real data into the index at the rate you
expect in production. Then query it at the rate you expect in production
and check throughput and response times. Next, run your facets, sorts, and
aggregations, again checking throughput, response times, and RAM usage.
After a little bit of testing, you should have a good sense of the limits
of your hardware (per node) and can go from there.
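The measurement loop itself is simple. Below is a minimal sketch of a load-test harness that records per-request latency and overall throughput; `index_document` is a hypothetical stand-in for a real indexing call (e.g. via the official Elasticsearch client against your test cluster), stubbed here so the sketch is self-contained.

```python
import random
import statistics
import time

def index_document(doc):
    # Hypothetical stand-in for a real indexing call against your
    # cluster (e.g. es.index(index="test", document=doc)).
    # Simulated latency so the harness runs standalone.
    time.sleep(random.uniform(0.001, 0.003))

def run_load_test(num_docs):
    """Index num_docs documents; report throughput and latency stats."""
    latencies = []
    start = time.perf_counter()
    for i in range(num_docs):
        t0 = time.perf_counter()
        index_document({"id": i, "body": "sample text"})
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "docs_per_sec": num_docs / elapsed,
        "median_ms": statistics.median(latencies) * 1000,
        "p95_ms": latencies[int(0.95 * len(latencies))] * 1000,
    }

stats = run_load_test(200)
print(stats)
```

The same loop works for the query side: swap the indexing call for a search call and rerun at your expected production query rate, watching the p95 figure rather than the average.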