Script to fix unassigned shards

Hello, can you point out what's wrong in the short script below?

#!/bin/bash

NODE="es_data_1"
IFS=$'\n'
for line in $(curl -u elastic:pass 'http://10.220.111.200:9200/_cat/shards' | fgrep UNASSIGNED); do
  INDEX=$(echo $line | (awk '{print $1}'))
  SHARD=$(echo $line | (awk '{print $2}'))

  curl -u elastic:pass 'http://10.220.111.200:9200/_cluster/reroute' -d '{
     "commands": [
        {
            "allocate": {
                "index": "'$INDEX'",
                "shard": '$SHARD',
                "node": "'$NODE'",
                "allow_primary": true
          }
        }
    ]
  }'
done

At least, I don't get the desired output:


{"error":{"root_cause":[{"type":"named_object_not_found_exception","reason":"[6:22] unknown field [allocate]"}],"type":"x_content_parse_exception","reason":"[6:22] [cluster_reroute] failed to parse field [commands]","caused_by":{"type":"named_object_not_found_exception","reason":"[6:22] unknown field [allocate]"}},"status":400}
------------------------------
UNASSIGNED
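A note on the error itself: the generic `allocate` reroute command was removed in Elasticsearch 5.0, which is why the parser reports `unknown field [allocate]`. Its replacements are `allocate_replica`, `allocate_stale_primary` and `allocate_empty_primary`, and the two primary variants take `"accept_data_loss": true` instead of `allow_primary`. The sketch below shows what such a request could look like. The helper names and the `ES_URL` / `ES_AUTH` variables are made up for illustration, and `allocate_empty_primary` discards whatever data the shard held, so treat this as a last resort rather than a routine fix.

```shell
#!/bin/bash
# Hypothetical helper: builds the reroute body for one shard.
# $1 = command (allocate_stale_primary or allocate_empty_primary)
# $2 = index name, $3 = shard number, $4 = target node name
reroute_body() {
  printf '{"commands":[{"%s":{"index":"%s","shard":%s,"node":"%s","accept_data_loss":true}}]}' \
    "$1" "$2" "$3" "$4"
}

# Hypothetical helper: sends the body to the cluster.
# ES_URL and ES_AUTH are assumed to be set by the caller.
# The JSON Content-Type header is included because recent versions
# reject request bodies that do not declare one.
reroute_shard() {
  curl -s -u "$ES_AUTH" -X POST "$ES_URL/_cluster/reroute" \
    -H 'Content-Type: application/json' \
    -d "$(reroute_body "$@")"
}
```

With `ES_URL` and `ES_AUTH` exported, a call would look like `reroute_shard allocate_empty_primary my-index 0 es_data_1`, where `my-index` is a placeholder index name.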

Try adding a set -x at the top to enable a bit of debugging.

However, you shouldn't need a script to fix things like this. If you are having issues with allocation, perhaps digging into it further will fix the underlying issue?

Yes, this script probably has a bug.
This is the output:

+ for line in '$(curl -u elastic:pass http://10.220.111.200:9203/_cat/shards | grep UNASSIGNED)'
++ echo UNASSIGNED
++ awk '{print $1}'
+ INDEX=UNASSIGNED
++ echo UNASSIGNED
++ awk '{print $2}'
+ SHARD=
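In this trace the loop variable holds the single word UNASSIGNED, so the index and shard columns never reach awk. That is what happens when the command output gets split on whitespace instead of on newlines. The thread does not say what the actual bug turned out to be, so the following is only one possible way to make the parsing more robust: filter on the state column with awk and read the two fields with `while read`, which does not depend on a global `IFS`. The function name and the sample index names are invented for the example, and the default `_cat/shards` column order (index, shard, prirep, state, ...) is assumed.

```shell
#!/bin/bash
# Reads _cat/shards output on stdin and prints "index shard" for every
# unassigned shard. Assumes the default column order:
# index shard prirep state docs store ip node
unassigned_shards() {
  awk '$4 == "UNASSIGNED" { print $1, $2 }'
}

# Canned input instead of a live cluster (index names are made up).
sample='logs-2022.09.07 0 p STARTED 120 1mb 10.0.0.1 es_data_1
logs-2022.09.08 0 p UNASSIGNED
logs-2022.09.08 1 r UNASSIGNED'

# read -r splits each line into the two fields printed by awk.
printf '%s\n' "$sample" | unassigned_shards | while read -r index shard; do
  echo "index=$index shard=$shard"
done
```

Against a real cluster the `printf` would be replaced by the `curl ... /_cat/shards` call from the original script.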

I have a lot of unassigned shards, so I need to fix them with a reroute using the "allow_primary": true parameter.

"current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "CLUSTER_RECOVERED",
    "at" : "2022-09-08T17:15:35.604Z",
    "last_allocation_status" : "no_valid_shard_copy"
  },
  "can_allocate" : "no_valid_shard_copy",
  "allocate_explanation" : "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster",

OK, I've found the bug; we can close this case.

@INS - I agree with @warkolm, you should not need a script like this. The recommended approach is to identify why these shards are unassigned using the cluster allocation explain API, and then address the problem accordingly.
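For reference, a minimal sketch of calling the cluster allocation explain API from the shell. The endpoint `_cluster/allocation/explain` and its `index`, `shard` and `primary` body fields are part of Elasticsearch; the helper names and the `ES_URL` / `ES_AUTH` variables are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical helper: builds the request body for one shard.
# $1 = index name, $2 = shard number, $3 = true for the primary, false for a replica
explain_body() {
  printf '{"index":"%s","shard":%s,"primary":%s}' "$1" "$2" "$3"
}

# Hypothetical helper: asks the cluster why the shard is unassigned.
# ES_URL and ES_AUTH are assumed to be set by the caller.
explain_shard() {
  curl -s -u "$ES_AUTH" -X POST "$ES_URL/_cluster/allocation/explain?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(explain_body "$@")"
}
```

A call such as `explain_shard my-index 0 true` (placeholder index name) returns fields like `unassigned_info` and `allocate_explanation`. Calling the endpoint with no body at all explains an arbitrary unassigned shard, which is handy as a first look.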


Have you read the whole topic?

It was already pointed out why these shards have not been assigned: "allocate_explanation" : "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster". Do you have any ideas or preferences?

@INS

Have you read the whole topic?

Yes we did.

That is why we highlighted that you should dig into the allocation failure and address the root cause.

Elasticsearch reports no_valid_shard_copy and "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster", so the question is what happened to the primary shard. Has the corresponding node left the cluster, either temporarily (e.g. instability, performance issues) or permanently (hardware failure, node is gone for good, etc.)?

You should also check whether your indices are configured with at least 1 replica shard. Indices without replica shards are not highly available.
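One quick way to check this, sketched under the assumption that the `_cat/indices` API is queried with `h=index,pri,rep` so that each line carries the index name, primary count and replica count. The function name and the sample index names are made up.

```shell
#!/bin/bash
# Reads "index pri rep" lines on stdin (the output of
# _cat/indices?h=index,pri,rep) and prints the indices with zero replicas.
indices_without_replicas() {
  awk '$3 == "0" { print $1 }'
}

# Canned input instead of a live cluster (index names are made up).
printf '%s\n' 'logs-2022.09.07 1 1' 'logs-2022.09.08 3 0' | indices_without_replicas
```

Against a real cluster the input would come from something like `curl -s -u elastic:pass 'http://10.220.111.200:9200/_cat/indices?h=index,pri,rep'` piped into the function.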