We are investigating the paid Elasticsearch offerings.
We saw that the smallest Cloud option comes with 45 GB of storage and only 1 GB of RAM.
Since some of our apps have more than 1 million docs (with a storage size of around 200 MB each), we are wondering what we can do with only 1 GB of RAM? It seems very low.
The nice thing about Cloud is that it's easy to scale up as you need.
And it depends on your use case. For my use case, a 1 GB instance is enough to index my 1 million document dataset. But I'm the only user, doing some business intelligence on this dataset...
First, you're going to need nodes that have enough storage for your documents.
You need to ingest them, and typically there will be a replica, so you'll have two nodes with a copy of each document on each node...
That's the basics
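As a very rough sketch of what that means in practice, assuming the 8.x Python client and placeholder names (the `myapp` index, the endpoint, and the API key are made up, not anything from this thread): one primary plus one replica means every document is stored twice, so budget roughly twice the raw index size across the two nodes.

```python
from elasticsearch import Elasticsearch

# Placeholder endpoint and credentials - point this at your own cluster.
es = Elasticsearch("https://localhost:9200", api_key="...")

# One primary shard plus one replica: every document is stored twice,
# once on each node, so plan for roughly 2x the raw index size.
es.indices.create(
    index="myapp",
    settings={"number_of_shards": 1, "number_of_replicas": 1},
)

# After ingesting, check the real on-disk footprint (primaries + replicas).
stats = es.indices.stats(index="myapp")
size_mb = stats["_all"]["total"]["store"]["size_in_bytes"] / 1024**2
print(f"Total store size: {size_mb:.1f} MB")
```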
But then of course you need to account for CPU
And this is where David is saying you must test, because perhaps you only query the data once a week, which takes very little CPU, or maybe you're searching the data a thousand times per second, which will take a lot of CPU.
And finally, tuning your schema and your searches can also change the amount of CPU you need.
So you need to do your testing. That's the only answer.
So you can load the data up, do your testing, scale up or scale down, and even terminate the cluster while you think about whether it makes sense for you or not.
When you terminate the cluster, your data will be deleted as well.
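If it helps, here's a minimal sketch of that kind of test with the Python client: bulk-load a sample and time a representative query. The index name, documents, and query below are made up for illustration; swap in your own data and the searches your app really runs.

```python
import time
from elasticsearch import Elasticsearch, helpers

# Placeholder endpoint and credentials - point this at your trial deployment.
es = Elasticsearch("https://localhost:9200", api_key="...")

# Bulk-load a sample of documents (dummy docs here; use your real ones).
actions = (
    {"_index": "myapp", "_source": {"title": f"doc {i}", "body": "sample text"}}
    for i in range(100_000)
)
helpers.bulk(es, actions)

# Run a representative query in a loop and measure throughput, then
# compare it with the query rate you actually expect in production.
n = 500
start = time.perf_counter()
for _ in range(n):
    es.search(index="myapp", query={"match": {"body": "sample"}}, size=10)
elapsed = time.perf_counter() - start
print(f"~{n / elapsed:.0f} queries/sec on this node size")
```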