I can't edit my deployment because I have too many shards in my cluster

Hi, community.

I recently talked to the Elastic Cloud support team, and they told me that I can't edit my deployment because I have too many shards in my cluster. They suggested that I delete unused indexes. I'm doing that now and have found many indexes named .reporting-YYYY.MM.DD. When I tried to delete them using the Kibana UI, it warned me that they are system indexes and that deleting them could break Kibana. Is it safe to delete them? I don't think they hold any relevant information.

Thanks.

It's ok to delete older ones of those, yes :slight_smile:

Thanks, @warkolm. I have deleted those indexes. However, I still can't edit my deployment. Every time I try to edit the cluster, it fails with the following error:

"Finished a few seconds ago and took a minute, but ultimately failed"

When I click Details, the only thing I see is what is in the attached image; I can't see any other log. According to GET /_cat/shards, I started with 1400 shards and now have 606. I have deleted everything I can, and I'm not sure how many more I have to remove before I can change anything in my deployment.

Do you know what else I can do to get my cluster back to a healthy state so I can make some changes?

Thanks.
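
For anyone following along, here is a minimal sketch of how the remaining shards could be tallied per index, to see where the 606 are going. It assumes the rows come from GET /_cat/shards?format=json, which returns a list of objects with an "index" field. The function name and the sample rows are made up for illustration and are not from this cluster.

```python
from collections import Counter


def shards_per_index(cat_shards_rows):
    """Count shards (primaries and replicas together) per index.

    `cat_shards_rows` is assumed to be the parsed JSON body of
    GET /_cat/shards?format=json: a list of dicts with an "index" key.
    """
    return Counter(row["index"] for row in cat_shards_rows)


# Hypothetical sample rows, shaped like the _cat/shards JSON output.
sample_rows = [
    {"index": ".reporting-2020.01.05", "shard": "0", "prirep": "p", "state": "STARTED"},
    {"index": ".reporting-2020.01.05", "shard": "0", "prirep": "r", "state": "STARTED"},
    {"index": "logs-app", "shard": "0", "prirep": "p", "state": "STARTED"},
    {"index": "logs-app", "shard": "0", "prirep": "r", "state": "STARTED"},
    {"index": "logs-app", "shard": "1", "prirep": "p", "state": "STARTED"},
    {"index": "logs-app", "shard": "1", "prirep": "r", "state": "STARTED"},
]

counts = shards_per_index(sample_rows)

# Print the indices with the most shards first, then the overall total.
for index_name, shard_count in counts.most_common():
    print(f"{index_name}: {shard_count}")
print("total:", sum(counts.values()))
```

Sorting by shard count makes it easier to spot indices with many small shards, which are usually the first candidates for deletion.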

Welcome to the community, @daniel_at_gus. Sorry you have to join with such a tough issue.

One thing you could try: after a cluster is overrun, the indices can all turn read-only.
Have you tried making them writeable again?

Just a thought; perhaps you already did that. You could (and should) check with support that they are OK with it.

This makes all the system indices writeable again:

PUT /.*/_settings
{
  "index": {
    "blocks.read_only_allow_delete" : false,
    "blocks.read_only": false
  }
}

Without the leading . it does the same for all the other indices:

PUT /*/_settings
{
  "index": {
    "blocks.read_only_allow_delete" : false,
    "blocks.read_only": false
  }
}
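
To double-check that the blocks are really gone afterwards, something like the following could help. It is only a sketch: it assumes the input is the parsed JSON from GET /*/_settings in its default nested form (index name, then "settings", then "index", then "blocks"), with setting values returned as strings. The function name and the sample response are hypothetical.

```python
def indices_with_blocks(settings_response):
    """Return {index_name: [active block names]} for indices that still have blocks.

    `settings_response` is assumed to be the parsed JSON body of
    GET /*/_settings (nested form, not flat_settings).
    """
    blocked = {}
    for index_name, body in settings_response.items():
        blocks = body.get("settings", {}).get("index", {}).get("blocks", {})
        # Settings values come back as strings, so compare against "true".
        active = sorted(
            name for name, value in blocks.items() if str(value).lower() == "true"
        )
        if active:
            blocked[index_name] = active
    return blocked


# Hypothetical response: one index still blocked, one cleared, one never blocked.
sample_response = {
    ".kibana_1": {
        "settings": {"index": {"blocks": {"read_only_allow_delete": "true"}}}
    },
    "logs-app": {
        "settings": {
            "index": {"blocks": {"read_only_allow_delete": "false", "read_only": "false"}}
        }
    },
    "metrics": {"settings": {"index": {"number_of_shards": "1"}}},
}

still_blocked = indices_with_blocks(sample_response)
print(still_blocked)
```

An empty result would suggest that no index-level read-only blocks remain, so the failing plan would have to have another cause.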

Hi, @stephenb.

Thanks a lot for your help.

I've run both requests you shared to make sure none of the indexes is read-only. Both PUT requests answered "acknowledged": true. Sadly, when I tried to make a change to the cluster, it failed again at the same step, "Plan successful".

Do you have any other ideas about what I can do to solve this issue? Thanks.

What is the current configuration in terms of nodes and masters?

What size are they in terms of RAM, and what type (highio, etc.)?

What are you trying to change, from what to what?

Hi, @stephenb. Here is my current configuration:

  1. Data (aws.data.highio.i3) : 15 GB RAM and 450 GB storage x 1 node x 2 zones = 30 GB RAM 900 GB storage
  2. Master node (aws.master.r4)
  3. Kibana (aws.kibana.r4): 1 GB RAM x 1 instance x 1 zone = 1 GB RAM

I'm implementing single sign-on through AWS Cognito using OpenID Connect. I have my realm set up, but I want to change some URLs, such as op.authorization_endpoint and op.token_endpoint. I also want to change claims.principal to capture the user's email. Those are the changes I want to apply.

Thanks.

OK, I can't really debug SSO for you; I thought you were trying to change cluster sizes and so on. As for the failed plan: typically, when the plan fails fast, like it says with 0 milliseconds, it means you have a bad configuration or bad settings. I would reach back out to support, tell them you fixed the number of shards, and ask for details on why the plan change is failing.