This is my first post on the forums, and it's also my first time working with Elasticsearch, so please bear with me a little.
We have a cloud application that indexes and searches file contents (PDFs, Office, etc) and we currently use the AWS service, CloudSearch.
The service is good, as we don't really need to configure anything other than the fields, but it was becoming really expensive. So we decided to move to the Elasticsearch service also provided by AWS.
Our index size is approximately 30GB (and growing), and the number of searches is currently small (a couple of dozen per day), but each search returns a lot of results (1000) without pagination.
So the first thing that is different is that I need to decide the number of nodes, shards, replicas, master nodes, instance types...
I saw some articles mentioning a shard size of around 50GB, and saying that too many shards on a small instance can be very inefficient.
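To make the sizing arithmetic concrete, here is a rough sketch of how I'm reasoning about it. It assumes the 50GB figure from those articles is an upper bound per shard rather than a target, and the function names (`min_primaries`, `shard_plan`) are just my own helpers, not part of any Elasticsearch API.

```python
import math

def min_primaries(index_gb: float, max_shard_gb: float = 50.0) -> int:
    """Smallest number of primary shards that keeps each shard under max_shard_gb.

    max_shard_gb defaults to the ~50GB figure from the articles (assumed upper bound).
    """
    return max(1, math.ceil(index_gb / max_shard_gb))

def shard_plan(index_gb: float, primaries: int, replicas: int) -> dict:
    """Per-shard size, total shard count, and total storage for a given layout."""
    return {
        "per_shard_gb": index_gb / primaries,
        # Every primary shard gets `replicas` extra copies.
        "total_shards": primaries * (1 + replicas),
        "total_storage_gb": index_gb * (1 + replicas),
    }

if __name__ == "__main__":
    print(min_primaries(30))     # primaries needed for the current 30GB index
    print(shard_plan(30, 2, 1))  # a layout with 2 primaries and 1 replica
```

By this estimate, a 30GB index would fit in a single primary shard, and two primaries would put each shard at about 15GB, so the shard count here seems driven more by growth and node count than by the size limit.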
I was thinking of something like the setup below.
Any advice would be extremely helpful. My biggest doubt is about the number of shards.
- Instance type: t2.medium (2 vCPU, 4 GiB) - As the number of searches is small, I was also thinking about starting with a t2.small (1 vCPU, 2 GiB).
- Two nodes (one master, one replica)
- Two shards (Because of the small instances, I thought too many shards would be bad)
- One replica (For security and availability)
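In case it helps to see it spelled out, this is roughly how I'd expect the shard and replica choices above to look as an index-creation body, plus the kind of search we run. It's only a sketch: `number_of_shards`, `number_of_replicas`, and `size` are standard Elasticsearch settings and parameters, but the `content` field name and the helper functions are placeholders I made up for illustration, and I haven't tried this against a real cluster yet.

```python
import json

def index_settings(primaries: int = 2, replicas: int = 1) -> dict:
    """Request body for creating an index with the proposed shard layout."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,    # fixed at index creation
                "number_of_replicas": replicas,   # can be changed later
            }
        }
    }

def search_body(query_text: str, size: int = 1000) -> dict:
    """Search request returning up to `size` hits in one response (no pagination).

    "content" is a hypothetical field holding the extracted file text.
    """
    return {
        "size": size,
        "query": {"match": {"content": query_text}},
    }

if __name__ == "__main__":
    print(json.dumps(index_settings(), indent=2))
    print(json.dumps(search_body("quarterly report"), indent=2))
```

The first body would go in the request that creates the index, and the second in the search request against it.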