Hi, I'm planning to use Elastic Cloud for my project, but I have some confusion about the right way to configure my Elasticsearch architecture. According to the price calculator (https://cloud.elastic.co/pricing), to configure a cluster with 4 data nodes I have to buy 4 nodes with 58 GB of memory each. That's way more resources than I need, and it will cost me a lot of money. If we look at AWS ES, I can buy 4 nodes with 8 GB of memory each, which is perfect for me. On Elastic Cloud I can't configure such a cluster because of the minimum threshold of 58 GB of memory.
My question is which solution will be best in terms of performance and fault tolerance:
Set up 4 nodes with 8 GB of memory each on AWS ES, so that if one node goes down the other 3 keep doing the job (P.S. I understand the limitations of AWS ES and they don't bother me).
Set up 1 node with 58 GB of memory on Elastic Cloud, but here I don't know what happens if this node goes down. Will my data be transferred to another 58 GB node automatically, which could cause downtime, or will I have to wait for the current node to recover?
Why not start smaller and see how much heap / how many nodes you really need?
I mean that IMHO, if the data indexed on disk is 40 GB, you can most likely use a single primary shard, or 2, to hold the data. Then, 2 shards can fit in a few GB of heap. I'd start with a 4 GB cloud instance, test, and if it does not work, increase to 8 GB (4 GB of heap)...
Then, I'd probably use 2 availability zones. So something like the sketch below:
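To make the idea concrete, here is a minimal sketch of the index side of that setup, assuming the official Python client (elasticsearch 8.x); the deployment URL, API key, and index name are placeholders. With 1 primary shard and 1 replica spread across 2 availability zones, the cluster keeps a full copy of the data in each zone.

```python
from elasticsearch import Elasticsearch

# Placeholder Elastic Cloud endpoint and API key -- replace with your own.
es = Elasticsearch(
    "https://my-deployment.es.us-east-1.aws.found.io:9243",
    api_key="YOUR_API_KEY",
)

# ~40 GB of data usually fits comfortably in 1 primary shard; 1 replica
# gives a second full copy, allocated to the other availability zone.
es.indices.create(
    index="my-index",
    settings={
        "number_of_shards": 1,
        "number_of_replicas": 1,
    },
)
```

If one zone goes down, the replica in the surviving zone is promoted to primary, so the data stays available while the failed node recovers.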
I think you are right here. One question I would like to clarify: if I choose to set up Elastic Cloud on AWS, will I be charged for data transfer between AWS RDS, S3, DynamoDB and Elastic Cloud? If my AWS resources are located in the same region as the Elastic Cloud deployment then I shouldn't be, but I may be wrong.