My dataset is small (about 7 GB), and our machines are big enough to keep all of it in memory (32 GB each), but we run a very large number of queries against it. I'm trying to find the best cluster setup for this scenario.
Should we keep the default 5-shard configuration or go down to 1 shard? Sharding matters when a single server can't hold the whole dataset in memory, right? In our case, everything fits in memory.
Or should I keep the 5 shards, but have a copy of all of them on every machine?
The idea is that if everything is in the same shard, we waste less time merging results from multiple shards during a query.
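
To make the two options concrete, here is a rough sketch of the index settings I have in mind. It assumes Elasticsearch-style settings (`number_of_shards`, `number_of_replicas`) and a 3-node cluster; the node count and the helper name `index_settings` are made up for illustration, and the script only builds the request bodies without contacting any cluster.

```python
import json


def index_settings(shards: int, nodes: int) -> dict:
    """Build an index-creation body that puts a full copy of the data on every node."""
    # One primary plus (nodes - 1) replicas means each node can hold
    # one copy of every shard.
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": nodes - 1,
        }
    }


NODES = 3  # hypothetical cluster size, not our real one

single_shard = index_settings(1, NODES)  # option A: one shard, copied everywhere
five_shards = index_settings(5, NODES)   # option B: default 5 shards, copied everywhere

for name, body in (("1 shard", single_shard), ("5 shards", five_shards)):
    s = body["settings"]
    copies = s["number_of_shards"] * (1 + s["number_of_replicas"])
    print(f"{name}: {copies} shard copies across {NODES} nodes")
    print(json.dumps(body, indent=2))
```

In both cases every node would hold the full dataset; the only difference is whether a query touches one shard or has to combine results from five.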