Performance implications of multiple primary shards for one index on the same data node

Hi there,

I have an Elastic Cloud cluster running v7.10.2 with three data nodes, three dedicated master nodes, and three coordinating nodes.

One index is currently 150GB, spanning three primary shards (one per data node) with a replication factor of one. Its shards already sit at the upper end of the recommended 10GB-65GB shard size range, and the index is soon due to grow roughly 10x.
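For concreteness, here is the back-of-the-envelope arithmetic (the 10x figure is the growth estimate above, not a measured number):

```python
# Rough shard-size arithmetic for the scenario above (all sizes in GB).
current_index_gb = 150
primaries = 3

per_shard_now = current_index_gb / primaries
print(per_shard_now)  # 50.0 -- already at the upper end of the 10-65GB guidance

projected_index_gb = current_index_gb * 10  # estimated ~10x growth
per_shard_projected = projected_index_gb / primaries
print(per_shard_projected)  # 500.0 -- far above the recommended range at 3 primaries
```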

Data in the index is mutable with bursty read/write patterns, and it cannot currently be split into separate indices or deferred to e.g. warm storage. ILM and data streams are not in use given the data's mutable, non-time-series nature.

I'm unclear on the recommended practice for the upcoming data growth. Is it advisable to simply increase the primary shard count to keep the per-shard size within the recommended range, even though the number of data nodes will not increase?

Assuming CPU, memory, and general resources on the nodes can accommodate it, and the nodes stay comfortably within the maximum total shard recommendations, are there any disadvantages to an increasing number of primary shards for the same index residing on the same node?

Should I be concerned with a future 1TB index spanning ~20 primary shards (with one replica), on just three data nodes?
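To make the 1TB scenario concrete, the shard layout works out as follows (assuming an even spread of data across shards, which real shards only approximate):

```python
# Shard layout for a hypothetical 1TB index with 20 primaries and 1 replica
# distributed across 3 data nodes.
index_gb = 1000
primaries = 20
replicas = 1
data_nodes = 3

per_shard_gb = index_gb / primaries
total_shards = primaries * (1 + replicas)   # primary + replica copies
shards_per_node = total_shards / data_nodes

print(per_shard_gb)     # 50.0 GB per shard -- within the 10-65GB guidance
print(total_shards)     # 40 shard copies cluster-wide
print(shards_per_node)  # ~13.3 shards of this one index per data node
```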

Note: I would generally like to increase the data node count in time to distribute load (in particular for bursts), but the baseline load on these nodes is low, and the Elastic Cloud service has a strange limit whereby you cannot increase the data node count past three (one per availability zone) until you first pay for the largest memory size instances available:

  • Note that to increase the number of nodes assigned to an instance configuration you must first scale up to the maximum RAM for that instance type. For example, if the maximum value on the RAM per Node slider for your Elasticsearch data node is 64GB, you need to scale up to that value before you can add additional nodes.

Source: Customize Your Deployment

Welcome to our community! :smiley:

Yes, so that in the future you can scale up the node count. Smaller shards are also easier to manage.

I guess that depends on what you need from the index. Do you have SLAs around response times?

Thanks warkolm.

My preference would be to introduce additional smaller data nodes as the index's primary shard count increases with data volume. However, adding additional data nodes (past one per availability zone) isn't supported by Elastic Cloud without first upgrading to the largest instance type available. This is a 2-4x price bump which our baseline resource usage won't justify for some time.

So I believe the single index in TB territory, spanning an increasing number of primary shards, may be limited to just three data nodes for some time.

90% of single-doc upserts and single-doc retrievals in < 100ms, 100% in < 1 second. This accounts for the majority (~90%) of activity on the index.

The remaining ~10% of activity, search and aggregations, isn't time sensitive, but aspirationally completes in sub-second to single-digit seconds on average.
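For what it's worth, the single-doc SLA above can be expressed as a simple percentile check over measured latencies. A minimal sketch (the function name and sample numbers are hypothetical, not from my actual monitoring):

```python
import math

def meets_sla(latencies_ms, p90_target=100, p100_target=1000):
    """Check the p90 < 100ms and p100 < 1s targets against a latency sample."""
    ordered = sorted(latencies_ms)
    # Nearest-rank p90: smallest value covering at least 90% of samples.
    p90 = ordered[max(0, math.ceil(0.9 * len(ordered)) - 1)]
    p100 = ordered[-1]
    return p90 <= p90_target and p100 <= p100_target

samples = [20, 35, 40, 55, 60, 70, 80, 90, 95, 400]  # made-up measurements (ms)
print(meets_sla(samples))  # True: p90 = 95ms, worst case 400ms
```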

Note that some of this data structure and usage pattern is evolving, but timelines aren't known.

You'd need to resource it appropriately on or off prem, so focus on that.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.