AWS Elasticsearch Best Practice for Indexes

We have been using AWS Elasticsearch Service for a while (a few years) with the current configuration:

  1. No dedicated master nodes.
  2. 2K+ indices and 10K primary shards, 20K shards in total with one replica each.
  3. 4 data node instances with 8GB of memory each.
  4. The total data size is about 40GB.
  5. We are still running ES 2.3, which is already a very old version.
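
To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it assumes the 40GB figure covers primary data only and that shards are spread evenly across the four data nodes, neither of which I have verified.

```python
# Figures taken from the configuration list above.
primary_shards = 10_000
total_shards = 20_000   # primaries plus one replica each
data_nodes = 4
node_memory_gb = 8      # instance memory, not JVM heap
total_data_gb = 40      # assumption: primary data only

# Assumption: shards are balanced evenly across the data nodes.
shards_per_node = total_shards / data_nodes
avg_primary_shard_mb = total_data_gb * 1024 / primary_shards
shards_per_gb_memory = shards_per_node / node_memory_gb

print(f"shards per node: {shards_per_node:.0f}")
print(f"average primary shard size: {avg_primary_shard_mb:.1f} MB")
print(f"shards per GB of instance memory: {shards_per_gb_memory:.0f}")
```

Under those assumptions, each node holds about 5,000 shards and the average primary shard is only around 4 MB, far below the 30GB figure mentioned in the posts I have read.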

Things worked smoothly until AWS decided to run maintenance, which brought our cluster up and down for days without ever finishing the migration and lowered our availability.

I have a few questions:

  1. I saw a lot of posts saying that as long as each shard is under 30GB, it is fine. But due to some business logic, we have to create a lot of indices and hence a lot of small shards. Would that impose performance penalties? What is a good strategy?
  2. Should we upgrade to a higher version of ES given the current situation, assuming newer versions also offer better API functions?
  3. Any other general advice?

Thank you so much in advance!

Upgrading is recommended. Also have a look at this blog post about sharding.

Welcome!

Not related to your question, but did you look at https://www.elastic.co/cloud and https://aws.amazon.com/marketplace/pp/B01N6YCISK ?

Cloud by Elastic is one way to have access to all features, all managed by us. Think about what is already there, like Security, Monitoring, Reporting, SQL, Canvas, APM, Logs UI, Infra UI, SIEM, Maps UI, and what is coming next :)
