How do I increase or reduce the shard count of an existing index?

Traditionally, once you created an index with a given number of primary shards, that count was fixed until you reindexed your data. That meant that if you hit the limit of documents a shard can hold (about two billion, a Lucene limit), you were stuck until you reindexed.

However, as of Elasticsearch 5.0 you can _shrink an index, and as of Elasticsearch 6.1 you can _split one, thereby decreasing or increasing the number of primary shards.

Reducing your shard count - aka _shrink

Shrinking is used in ILM, and more generally for reducing excessive shard counts.

Before you can shrink an index:

  • The index must be read-only.
  • All primary shards for the index must reside on the same node.
  • The index must have a green health status.

The other important thing to take into account is the number of primary shards in the original index, as you can only shrink to a count that is a factor of it. That means with 6 primary shards you can shrink to 3, 2, or 1; with 9 it'd be 3 or 1.
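
If you're not sure what you're starting from, one way to check the current primary shard count and health in a single call is the cat indices API (using the my_source_index name from the example below);

GET _cat/indices/my_source_index?v&h=index,health,pri,rep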

Here's an example, adapted from the docs. Let's assume we created an index with 4 primary shards, and one set of replicas;

PUT my_source_index
{
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 1
  }
}

Now we want to shrink the my_source_index index, which starts with the 3 prerequisite steps above. Replace the node name in the example below (instance-0000000000) with one of your own node names, which you can get from _cat/nodes?v;

PUT /my_source_index/_settings?pretty
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.routing.allocation.require._name": "instance-0000000000",
    "index.blocks.write": true   
  }
}
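
Before kicking off the shrink, it's worth confirming that all the shards have actually relocated to that node and are in the STARTED state. For example;

GET _cat/shards/my_source_index?v&h=index,shard,prirep,state,node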

Then we tell Elasticsearch to run the shrink process and write the result into a new index called my_target_index, with 1 primary shard. The example also switches the codec to best_compression, trading a little stored-field read speed for smaller storage;

POST /my_source_index/_shrink/my_target_index
{
  "settings": {
    "index.number_of_shards": 1, 
    "index.codec": "best_compression" 
  }
}

We can check the progress with;

GET _cat/recovery?v
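
Or, if you'd rather block until the new index is fully allocated, the cluster health API can wait for a given status (the timeout value here is just an example);

GET _cluster/health/my_target_index?wait_for_status=green&timeout=30s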

Once it's all done, you probably want to remove the original index and then add an alias with the old name pointing at the new index, so that anyone still using the old index name doesn't find their data missing. The delete has to come first, since an alias can't share a name with an existing index;

DELETE my_source_index
PUT /my_target_index/_alias/my_source_index
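
You can confirm the alias is in place and pointing at the new index with;

GET _alias/my_source_index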

Increasing your shard count - aka _split

This is really similar to what we did with the shrink, just in reverse. NOTE - if you are using ILM, you probably won't want to use this split functionality; let the ILM policy roll over to a new index instead.
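
For reference, here's a minimal sketch of that rollover approach (the policy name my_rollover_policy and the thresholds are just made-up examples);

PUT _ilm/policy/my_rollover_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "30d",
            "max_size": "50gb"
          }
        }
      }
    }
  }
}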

Before you can split an index:

  • The index must be read-only.
  • The cluster health status must be green.

Like shrinking, splitting works in factors: the target shard count must be a multiple of the source count. The upper bound is controlled by number_of_routing_shards, an index setting that can be applied during index creation, or if the index has been closed (you'd then reopen it after the change). The docs dive into this aspect in more detail.
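
As an illustration, here's a sketch of setting it at creation time (the index name is hypothetical, and 30 is just an example value that leaves room to split by factors of 2, 3 and 5 later on);

PUT my_splittable_index
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_routing_shards": 30
  }
}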

Using an index like the 4-shard one we created above, let's first set it to be read-only;

PUT /my_source_index/_settings
{
  "settings": {
    "index.blocks.write": true 
  }
}

And then run the split process itself, doubling the 4 primary shards to 8;

POST /my_source_index/_split/my_target_index
{
  "settings": {
    "index.number_of_shards": 2
  }
}

Check the progress with;

GET _cat/recovery?v

Again, once it's all done, you probably want to remove the original index, and then create an alias so that anyone still using the old index name doesn't find their data missing;

DELETE my_source_index
PUT /my_target_index/_alias/my_source_index
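
And as a final sanity check, confirm the new index has the shard count you expected;

GET _cat/indices/my_target_index?v&h=index,health,pri,rep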