Shrinking an index

Hi,

I have many indices with 5 primary shards and 1 replica (the default).
Since I have only one node and the indices hold little data, I want to reduce the number of shards.
This is my machine: 64 GB RAM, ~1 TB disk, NVMe SSD.
Each index contains 500 MB to 1 GB of data.
If I don't need an index frequently, I close it.

I tried to optimize my node by reducing the shard count with shrink, but shrinking forces me to rename my indices. So I wrote a script that shrinks an index (e.g. indexa-2019 -> indexa-2019_shrinked), deletes the original index (indexa-2019), and reindexes the shrunken index back under the old name (indexa-2019_shrinked -> indexa-2019).
I noticed that shrinking is faster than reindexing. How is this possible? And is there a way to speed up this process? (I don't want to use aliases on this host.)

Best Regards :raised_hand_with_fingers_splayed:

Each shard of an index is made of multiple (immutable) segments. Shrinking works by sharing the segments between the source and destination indices, which is normally very cheap as it can be done by hard-linking the underlying files.
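
To illustrate why hard-linking is cheap, here is a minimal Python sketch of hard links in general. It is not Elasticsearch's actual code, and the file names are made up for the example. It creates a file, hard-links it under a second name, and shows that both names point at the same data on disk with nothing copied.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, hard-link it, and report what the filesystem sees.

    The file names are hypothetical stand-ins for a segment file; this only
    demonstrates hard links, not how Elasticsearch lays out its data.
    """
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "segment_source.dat")
        target = os.path.join(tmp, "segment_target.dat")

        # Write 1 KiB of data once.
        with open(source, "wb") as f:
            f.write(b"x" * 1024)

        # Add a second directory entry for the same data; no bytes are copied.
        os.link(source, target)

        same_file = os.path.samefile(source, target)
        link_count = os.stat(source).st_nlink

        # Removing the first name leaves the data reachable via the second.
        os.remove(source)
        with open(target, "rb") as f:
            surviving_bytes = len(f.read())

        return same_file, link_count, surviving_bytes


if __name__ == "__main__":
    print(hard_link_demo())
```

The cost of creating the link does not depend on the file size, which is why sharing immutable segments this way is so much faster than rewriting them.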

Reindexing involves re-reading each document in the index, re-doing all the analysis, and writing it back to a new index, which is much more work.

In theory one could shrink an index to another index with the same number of shards, effectively performing a rename, but in practice the preferred way to do this is with aliases.
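
For reference, a rough sketch of what the alias route could look like, written as a Python function that only builds the REST requests and does not send them. The function name is hypothetical, the index names are the ones from this thread, and the exact prerequisites for shrinking (write block, shard allocation, cluster health) depend on your Elasticsearch version, so treat this as an outline under those assumptions and check the docs for your release.

```python
import json


def shrink_and_alias_plan(source, target, alias, shards=1):
    """Return the (method, path, body) requests for shrink + alias swap.

    Hypothetical helper: it only describes the calls, it does not run them.
    """
    return [
        # 1. Block writes on the source index, as shrinking requires.
        ("PUT", f"/{source}/_settings", {"index.blocks.write": True}),
        # 2. Shrink into a new index with fewer primary shards.
        ("POST", f"/{source}/_shrink/{target}",
         {"settings": {"index.number_of_shards": shards}}),
        # 3. In one atomic call, delete the source index and point an alias
        #    at the shrunken index instead.
        ("POST", "/_aliases", {"actions": [
            {"add": {"index": target, "alias": alias}},
            {"remove_index": {"index": source}},
        ]}),
    ]


if __name__ == "__main__":
    plan = shrink_and_alias_plan("indexa-2019", "indexa-2019_shrinked", "indexa-2019")
    for method, path, body in plan:
        print(method, path, json.dumps(body))
```

With this approach the old name keeps working for searches, and no documents are re-read or re-analysed.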

Hi David,
performing a "rename" with shrinking unfortunately failed:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "can't select recover from shards if both indices have the same number of shards"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "can't select recover from shards if both indices have the same number of shards"
  },
  "status": 400
}

Am I doing something wrong?
As far as I understand your post, it should be possible to shrink an index with 5 shards into another index with a different name but also 5 shards, then remove the original index and shrink down to 1 shard under the original name.
Correct?

Thank you!

No, sorry, I wasn't clear. By "in theory" I meant that you could adjust the implementation in Elasticsearch to permit this. As it is today, you cannot do this, because the preferred way forward is using an alias.
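
To make the constraint concrete: the target shard count has to be a factor of the source shard count, and, as the error above shows, it cannot equal it. A small sketch (the helper name is made up for illustration):

```python
def valid_shrink_targets(source_shards):
    """List the shard counts an index with `source_shards` primaries can shrink to.

    Hypothetical helper: a target must divide the source count evenly and,
    per the error above, must differ from it.
    """
    return [n for n in range(1, source_shards) if source_shards % n == 0]


if __name__ == "__main__":
    print(valid_shrink_targets(5))
    print(valid_shrink_targets(8))
```

Since 5 is prime, an index with 5 primaries can only be shrunk to a single shard.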

Argh. :worried: :sweat_smile:
OK, but then I have another question.
If I shrink my index indexa-2019 to indexa-2019-shrinked and create an alias indexa-2019, would Kibana show the data twice in Discover and visualizations, or only once? I ask because I have an index pattern like indexa-*.
(I think it should show only once since the ID is unique, but I am not really sure.)

I would guess twice, because deduplication is expensive and not always desirable, but I do not know exactly what you are looking at in Kibana, nor what exactly Kibana is asking of Elasticsearch.

OK, I think I am going to test this.
Thank you very much! :slight_smile: Appreciate it :+1:

