Elasticsearch 7.10.1 new test cluster - cannot bring up - state not recovered

Hello,

I am trying to bring up a new ES 7.10.1 cluster for testing as follows:

  • node1 - master (node.roles: [ 'master'])
  • node2 - master (node.roles: [ 'master'])
  • node3 - master (node.roles: [ 'master'])
  • node4 - data (node.roles: [ 'data'])

These are independent docker containers that I brought up in this order:

  1. start node1
  2. start node2
  3. start node3
  4. start node4

Next, I try to set the elastic user's password as follows:

  1. create a temp local file-based super user on node1 via: /usr/share/elasticsearch/bin/elasticsearch-users useradd elastic-tmp -p password -r superuser
  2. use this temp super user to set elastic user's password via PUT /_xpack/security/user/elastic/_password

However, when I execute step 2, it fails:

[node1 ~]# curl -k -uelastic-tmp:welcome1 -XPUT https://localhost:9201/_xpack/security/user/elastic/_password?pretty -H 'Content-Type: application/json' -d'{"password": "password" }'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "status_exception",
        "reason" : "Cluster state has not been recovered yet, cannot write to the [null] index"
      }
    ],
    "type" : "status_exception",
    "reason" : "Cluster state has not been recovered yet, cannot write to the [null] index"
  },
  "status" : 503
}

I can list the nodes and see the currently elected master via /_cat/nodes?v:

[node1 ~]# curl -sk -uelastic-tmp:welcome1 -XGET https://localhost:9201/_cat/nodes?v
ip    heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
<ip1>            2           7   0    0.03    0.07     0.07 m         *      node1-master
<ip2>            2           4   0    0.12    0.13     0.09 m         -      node2-master
<ip3>            2           7   0    0.03    0.04     0.01 d         -      node4-data1
<ip4>            1           7   0    0.03    0.04     0.01 m         -      node3-master
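
For anyone reading this later: the node.role column is the part of this output that matters for the problem described here. A minimal sketch for counting data nodes from it follows. The function name count_data_nodes is hypothetical, and it assumes that a role string containing "d" marks a data node, which ignores any other data-tier roles a node may carry.

```shell
# Hypothetical helper: count nodes whose role string contains "d" (data).
# Expects one role string per line on stdin, for example the output of
#   curl -sk -u<user>:<password> 'https://localhost:9201/_cat/nodes?h=node.role'
# Assumption: only the generic "d" role is treated as a data role here.
count_data_nodes() {
  awk '$1 ~ /d/ { n++ } END { print n + 0 }'
}
```

Piping the role column of the listing above (m, m, d, m) into count_data_nodes gives the number of data nodes that have joined, which can then be compared with whatever the gateway settings on the nodes require.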

However, I cannot view indices or shards. These requests fail with a cluster_block_exception (503, "state not recovered / initialized"), which is likely why the PUT fails too.

[node1 ~]# curl -sk -uelastic-tmp:welcome1 -XGET https://localhost:9201/_cat/indices?pretty
{
  "error" : {
    "root_cause" : [
      {
        "type" : "cluster_block_exception",
        "reason" : "blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];"
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : "ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];]",
    "caused_by" : {
      "type" : "cluster_block_exception",
      "reason" : "blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];"
    }
  },
  "status" : 503
}

The cluster health is red:

epoch      timestamp cluster          status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1608676569 22:36:09  es7-test-cluster red             4         1      0   0    0    0        0             0                  -                  NaN%

Any help would be appreciated.

Thanks.

Resolved.

The setting gateway.recover_after_data_nodes: 3 appears to have caused the "Cluster state has not been recovered yet" errors: the cluster was waiting for 3 data nodes to join before recovering its state, but only one data node was running.

Once I started 2 more data nodes for a total of 3, everything worked.
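
In case it helps someone else, here is a rough sketch of a pre-flight check for this mismatch. The function name check_recover_after is made up for illustration, and it assumes the setting is written on one line in elasticsearch.yml in the flat form shown above. It only compares the configured value with the number of data nodes you plan to start; it does not talk to the cluster.

```shell
# Hypothetical helper: compare gateway.recover_after_data_nodes in a config
# file with the number of data nodes that will be started.
# Usage: check_recover_after <path-to-elasticsearch.yml> <data-node-count>
check_recover_after() {
  conf="$1"
  data_nodes="$2"
  # Extract the numeric value of the last uncommented occurrence of the setting.
  required=$(sed -n 's/^[[:space:]]*gateway\.recover_after_data_nodes:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$conf" | tail -n 1)
  if [ -z "$required" ]; then
    echo "not set"
    return 0
  fi
  if [ "$data_nodes" -lt "$required" ]; then
    # State recovery would wait until enough data nodes have joined.
    echo "blocked: need $required data nodes, have $data_nodes"
    return 1
  fi
  echo "ok: $data_nodes >= $required"
}
```

With the configuration from this thread, calling check_recover_after with the config path and a data node count of 1 reports the shortfall, and a count of 3 passes. Lowering the setting to match a small test cluster would be the other way to resolve the mismatch.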
