How can we manage large shards in an index?

Hello, community

Issue Overview:

We are encountering an issue with one of our indices, which has entered an alerting state. The shard size for this index is currently 129.27GB, and it continues to increase. The index is configured with 5 shards and 2 replicas, and our cluster consists of 7 nodes.

Request for Guidance:

Could anyone advise on the possible steps to address this issue? Specifically:

  1. How can the growing shard size be managed effectively?
  2. Are our current settings for replicas and shard count optimal?
  3. Any recommendations for maintaining cluster stability and performance?

Hello

Welcome to the community.

What's the actual alert?

Is your data volume in this index growing? Rapidly? Any deletes/updates going on for the docs in this index?

btw, 5 primary shards and 2 replicas means 15 shards in total over 7 (data?) nodes, so at least one node has to hold 3 shards and the rest 2 each. It works of course, but it's a bit unbalanced to my OCD mind.

Output of

GET _cat/indices?v
GET _cat/shards?v
GET /your-index-name/_settings

might be helpful - you might wish to obfuscate your index names if paranoid.


You mean one shard is 129 GB, so the total index size is about 646 GB across the 5 primaries?

You will have to set up a new index and/or an ILM policy with a template to auto-rotate it.

for example

PUT _ilm/policy/sachin_log
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_primary_shard_size": "20gb",
            "max_age": "365d"
          }
        }
      }
    }
  }
}

PUT _index_template/sachin_log
{
  "index_patterns": ["sachin_log-*"],
  "template": {
    "settings" : {
      "number_of_shards": "5",
      "number_of_replicas": "1"
    }
  },
  "_meta": {
    "description": "Template sachin's log"
  }
}


PUT %3Csachin_log-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "sachin_log": {
      "is_write_index": true
    }
  }
}

Once you execute this, it will create the template sachin_log, which has 5 shards. ILM will manage it, and when a shard reaches 20 GB it will roll over to a new index.
The original index will be created by the PUT statement at the end as sachin_log-{date}-000001.
The next index will be sachin_log-{date}-000002, and so on...

Like Kevin said, if you have seven data nodes then I would go with seven shards, which is a little more balanced.

Our index is not based on time-series data, so we can't put an ILM policy on it.

In that case you will need to use the split index API, which will require downtime. Before doing this I would recommend you ensure you have a snapshot created with the snapshot API set up and working in case you run into any issues.

If you do not have a snapshot, take one while the cluster is running. This may be slow but will speed up taking later snapshots as a large number of segments can be reused.
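As a rough sketch, registering a repository and taking that initial snapshot could look like this (the repository name my_backup, the fs type, and the location are assumptions; use whatever repository type your cluster actually supports):

PUT _snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_backup"
  }
}

PUT _snapshot/my_backup/pre-split-snapshot-1?wait_for_completion=true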

Then you need to stop traffic to the cluster before taking another snapshot to make sure you have captured the latest data. You can then create a new index with a larger number of primary shards (set to a reasonable value based on number of nodes in the cluster and rate of growth). Once this is ready and in green state and you are happy with the document count etc, you can delete the original index. Note that splitting the index will use up a lot of disk space so make sure to test this in a test cluster ahead of time.
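For reference, the split itself might look roughly like this (my_index and my_index_split are placeholder names). The split API requires the source index to be made read-only first, and the target primary shard count must be a multiple of the source count, so 5 primaries can be split to 10, 15, 20, and so on:

PUT /my_index/_settings
{
  "settings": {
    "index.blocks.write": true
  }
}

POST /my_index/_split/my_index_split
{
  "settings": {
    "index.number_of_shards": 10
  }
}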

Once the original index has been deleted you can redirect your traffic to the new index or create an alias with the name of the old index in order to not have to modify the code. You can then turn traffic back on.
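Creating that alias could be done like this, assuming my_index was the original name and my_index_split is the new index (placeholder names). This only works after the original my_index has been deleted, since an alias cannot share a name with an existing index:

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "my_index_split",
        "alias": "my_index"
      }
    }
  ]
}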

If your index is not time series and you need to make changes to it, then create an index per month/year, etc.

Find a unique time field in your data. For example, if I have people data with their birth year, then year is what I will target, and my indices will be:

people-yyyy
people-2023 (every person born in 2023 goes here)
people-2022 (every person born in 2022 goes here)

Now you have a data view (people-*) so you can see all the data, you can still make changes, and no single index is crazy big like what you have now.
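As a minimal sketch of that layout (field names and document are made up), the application writes each document to the index matching its year and queries across the whole pattern:

PUT people-2023/_doc/1
{
  "name": "Alice",
  "birth_year": 2023
}

GET people-*/_search
{
  "query": { "match_all": {} }
}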

I would advise against this. It's similar to (but worse than) just splitting the index, increasing the shard count. If you split the index then each shard will end up pretty much the same size, but if you do some kind of artificial splitting (e.g. by birth year) then the distinct indices will have wildly different sizes, and you will eventually have to think about splitting some of them anyway.


I guess it depends on the use case, because in my case once the year is finished we don't get any new data in that index, unless we want to add a new field or update something, which is not that frequent.

That means you have time-series data. But the OP does not.

No, mine is not time-series, 100%. It's all about some jobs running and when they finished. I create my own _id from a combination of some fields, and we do go back in time and update some records when needed using an inline script.

I assume you are allocating data to the correct index name based on a timestamp, e.g. when the job started, and then update records over time. This means it is time-series data. Time-series data does not necessarily have to be immutable, even though that is very common and assumed for data streams.

@Christian_Dahlqvist yes, exactly. Then my understanding of time series is a little off. :slight_smile: