We have been using AWS Elasticsearch Service for a few years with the following configuration:
- No dedicated master nodes.
- 2,000+ indices with roughly 10K primary shards, about 20K shards total with one replica each (see the sketch after this list for how we check these counts).
- 4 data node instances with 8 GB of memory each.
- Total data size of about 40 GB.
- Still running Elasticsearch 2.3, which is already a very old version.
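For reference, here is a minimal sketch of how these index and shard counts can be checked through the `_cat` APIs. The endpoint URL is a placeholder for our domain, and it assumes the domain accepts plain HTTP requests from our network rather than signed IAM requests:

```python
import requests

ES = "https://my-domain.us-east-1.es.amazonaws.com"  # placeholder endpoint

# _cat/shards prints one line per shard copy; the "prirep" column is
# "p" for a primary shard and "r" for a replica.
shard_lines = requests.get(
    f"{ES}/_cat/shards", params={"h": "index,shard,prirep,store"}
).text.splitlines()
primaries = [line for line in shard_lines if line.split()[2] == "p"]

# _cat/indices prints one line per index.
index_lines = requests.get(
    f"{ES}/_cat/indices", params={"h": "index"}
).text.splitlines()

print("indices:            ", len(index_lines))
print("primary shards:     ", len(primaries))
print("total shard copies: ", len(shard_lines))
```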
Things worked smoothly until AWS ran maintenance on the domain; the process kept bringing our cluster up and down for days, never finished the migration, and hurt our availability.
I have a few questions:
- I have seen many posts saying that shards are fine as long as each stays under 30 GB. However, our business logic forces us to create many indices, and hence many small shards. Does having so many small shards impose a performance penalty? What would be a good strategy here? (One idea I am weighing is sketched after this list.)
- Given the current situation, should we upgrade to a newer version of Elasticsearch, since the newer releases offer better APIs?
- Any other general advice?
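For context on the shard-count question, here is a minimal sketch of the kind of index template I am considering: capping new indices at one primary shard instead of the ES 2.x default of five, which I believe is where our 10K primaries come from (roughly 2,000 indices times 5). The template name is made up, and this uses the ES 2.x `template` field (newer versions use `index_patterns` instead):

```python
import requests

ES = "https://my-domain.us-east-1.es.amazonaws.com"  # placeholder endpoint

# Hypothetical template: every newly created index gets 1 primary shard
# (plus 1 replica) instead of the ES 2.x default of 5 primaries.
template = {
    "template": "*",  # ES 2.x match-all pattern ("index_patterns" in 6.x+)
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1,
    },
}

resp = requests.put(f"{ES}/_template/one_shard_per_index", json=template)
print(resp.status_code, resp.text)
```

As far as I understand, existing indices would still need to be reindexed to benefit, since 2.3 predates the `_shrink` API. Is this a reasonable direction, or is there a better approach?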
Thank you so much in advance!