Best practice to add more indices

"error":"validation_exception","reason":"Validation Failed: 1: this action would add [2] shards, but this cluster currently has [999]/[1000] maximum normal shards open;"

Hi, I have a requirement to add more indices. I don't really want any replication, but rather more nodes that can distribute the index load while still acting like a single DB entity, where the client can query the main node and everything works.

Without replica shards you do not have any high availability. Are you OK with downtime, partial results, and the risk of data loss?

How much data do you have in the cluster at the moment? What is your average shard size?

What is your sharding strategy given that you have reached the default maximum limit?

Have you read the official documentation on the topic of shard sizing?

It is also always good to state which version of Elasticsearch you are using. Handling of large numbers of shards has improved in more recent versions compared to older ones.

It's our first time using Elasticsearch in a project, so I'm not sure how to answer these questions until I see things in action.

So currently we are creating 7 indices per client, which means the total number of indices on a node is going to be 7 times the number of clients we add this feature to.

Now, when the Elasticsearch limit is hit, we will probably need to add another node or bump up the hardware resources to keep adding more clients. But from the logical/business side, we still need to treat that Elasticsearch deployment as a single unit.
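To make the growth path concrete, here is a rough shard-count projection for the layout described above. It assumes the Elasticsearch 7+ defaults of 1 primary shard per index and the default cluster-wide limit of 1000 shards per data node; the function and parameter names are illustrative, not from the thread.

```python
def total_shards(clients, indices_per_client=7, primaries_per_index=1, replicas=0):
    """Total open shards the cluster must hold (primaries plus replicas)."""
    per_index = primaries_per_index * (1 + replicas)
    return clients * indices_per_client * per_index

# The default limit is 1000 shards per data node (cluster.max_shards_per_node).
print(total_shards(100))              # 700 shards with no replicas
print(total_shards(100, replicas=1))  # 1400 shards once each index gets 1 replica
```

With no replicas, 100 clients fit under a single node's default limit, but enabling even one replica per index (the default) pushes the same 100 clients past it, which matches the `[999]/[1000]` error at the top of the thread.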

How many clients do you need to support? Elasticsearch generally does not scale well with a large number of indices for the type of multitenant deployment you describe.

As you specify that there are exactly 7 indices per client, does this mean that there are 7 types of data with known and controlled mappings? Why are you setting up tenant specific indices instead of having tenants share indices and control access through the application? Do these 7 indices have any conflicting mappings or could they be combined into one?

Most of the data I asked for is available in the cluster stats API so it would be great if you could provide that.
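For reference, the cluster stats API mentioned above can be called from Kibana Dev Tools (or via `curl` against your cluster); the output includes total shard counts, index counts, and node/heap information:

```
GET _cluster/stats?human&pretty
```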

Since we are building an ETL from Postgres DB schemas to Elasticsearch indices, each tenant having a separate index is important; otherwise two documents with the same document ID would collide.

Yes, the mappings are controlled. Each client having 7 indices also helps with feature-wise separation of the various UI flows.

So the total number of clients will equal the total number of Postgres schemas (100+).

I'll check the API output and the docs you shared to get more understanding.

My primary idea is to use more CPU cores on the machine and dedicate a node to it. Again, I'm just thinking out loud here.

If you are looking to support only a couple of hundred tenants the approach may work, but if the number is expected to grow a lot I would not expect it to scale well. You can override the shard count limit per node, and increasing it a bit is not likely to cause any problems. I would however make sure you have enough RAM (see the docs I linked to earlier) and run Elasticsearch on dedicated hardware.
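For completeness, the per-node shard limit is the `cluster.max_shards_per_node` setting and can be raised with a cluster settings update; the value of 1500 below is just an example, not a recommendation:

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 1500
  }
}
```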