Shards are not allocating to available node


(Francisco Manuel Correia) #1

Hi,

I'm running a single-node Elasticsearch cluster, version 6.2.1, and several indices are failing to allocate their shards.

$ curl -s 'http://localhost:9200/_cluster/health?pretty'
{
  "cluster_name": "docker-cluster",
  "status": "red",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 3,
  "active_shards": 3,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 72,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 2,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 4
}
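For reference, the unassigned shards themselves (with the reason Elasticsearch records for each) can be listed via the `_cat/shards` API. This assumes the same localhost cluster as the health check above:

```shell
# Assumes the cluster from the health check above; adjust the endpoint if needed
ES=http://localhost:9200
# One row per shard: index, shard number, primary/replica, state, unassigned reason
curl -s "$ES/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason"
```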

I have a lot of unassigned shards. Here is the allocation explanation for one of them:

$ curl -s 'http://localhost:9200/_cluster/allocation/explain?pretty' -H 'Content-Type: application/json' -d '{"index": "logstash-logstashuser-2018-09-17", "shard": 2, "primary": true}'
{
  "index": "logstash-logstashuser-2018-09-17",
  "shard": 2,
  "primary": true,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "ALLOCATION_FAILED",
    "at": "2018-09-21T14:39:10.603Z",
    "failed_allocation_attempts": 5,
    "details": "failed shard on node [uco2OGESQfWoGBmKNEzzBg]: failed to create shard, failure AlreadyClosedException[Underlying file changed by an external force at 2018-09-14T15:52:45Z, (lock=NativeFSLock(path=/usr/share/elasticsearch/data/nodes/0/node.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],creationTime=2018-09-14T15:52:45.161531Z))]",
    "last_allocation_status": "no"
  },
  "can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "uco2OGESQfWoGBmKNEzzBg",
      "node_name": "uco2OGE",
      "transport_address": "172.21.0.3:9300",
      "node_decision": "no",
      "weight_ranking": 1,
      "deciders": [
        {
          "decider": "max_retry",
          "decision": "NO",
          "explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2018-09-21T14:39:10.603Z], failed_attempts[5], delayed=false, details[failed shard on node [uco2OGESQfWoGBmKNEzzBg]: failed to create shard, failure AlreadyClosedException[Underlying file changed by an external force at 2018-09-14T15:52:45Z, (lock=NativeFSLock(path=/usr/share/elasticsearch/data/nodes/0/node.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],creationTime=2018-09-14T15:52:45.161531Z))]], allocation_status[deciders_no]]]"
        }
      ]
    }
  ]
}
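The output itself suggests a manual retry via `_cluster/reroute?retry_failed=true`, which resets the failed-attempts counter. For reference, that call looks like this:

```shell
# The retry suggested by the max_retry decider above
ES=http://localhost:9200
curl -s -X POST "$ES/_cluster/reroute?retry_failed=true&pretty"
```

If the underlying node.lock problem is still present, the shards presumably just fail again with the same AlreadyClosedException after five attempts.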

From this I gather that the problem is related to something that was done to the node.lock file, which makes it impossible to activate/allocate any shard...

My question is: how can I proceed without losing any information? (I know that if I restart Elasticsearch the error will go away, but all the unassigned shards will be lost; they never really became active because of this error.)

I've pretty much tried "everything" to recover this :slight_smile:

Any help is appreciated...


(Florian Kelbert) #2

Hi Francisco,

Can you try to spin up a second node to see if the remaining shards would be allocated on this second node? You wouldn't even need a second physical or virtual machine for that. Just start another Elasticsearch instance on the same node. The easiest way to do this is to download the Elasticsearch tarball into any user directory, extract it, configure it, and run it.
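For the second instance to join the existing cluster, its config/elasticsearch.yml would need roughly the following. This is only a sketch: the cluster name and the first node's transport address are taken from your output above, and the ports are just picked to avoid clashing with the first instance:

```yaml
# Must match the first node's cluster name (from your health output)
cluster.name: docker-cluster
# Avoid clashing with the instance already bound to 9200/9300
http.port: 9201
transport.tcp.port: 9301
# Point zen discovery at the first node's transport address
discovery.zen.ping.unicast.hosts: ["172.21.0.3:9300"]
```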


(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.