Blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized]

Hi...
I have an Elasticsearch cluster with 8 nodes (2 client, 3 master, 3 data), and when I query the cluster health, the system shows me:

[root@log-elasticsearch-client-01 elasticsearch]# curl -X GET 'http://log-elasticsearch-client-01.alpha.ci.ucr.ac.cr:9200/_cluster/health?pretty'
{
"cluster_name" : "log-elasticsearch",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 8,
"number_of_data_nodes" : 3,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : "NaN"
}
[root@log-elasticsearch-client-01 elasticsearch]#

The Elasticsearch log shows this:

[2019-03-27T09:48:42,465][WARN ][o.e.m.j.JvmGcMonitorService] [log-elasticsearch-client-01] [gc][80200] overhead, spent [2.1s] collecting in the last [2.1s]
[2019-03-27T10:59:55,608][WARN ][r.suppressed ] [log-elasticsearch-client-01] path: /.reporting-/esqueue/_search, params: {index=.reporting-, type=esqueue, version=true}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:166) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedRaiseException(ClusterBlocks.java:152) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.action.search.TransportSearchAction.executeSearch(TransportSearchAction.java:297) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.action.search.TransportSearchAction.lambda$doExecute$4(TransportSearchAction.java:193) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:60) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.index.query.Rewriteable.rewriteAndFetch(Rewriteable.java:114) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.index.query.Rewriteable.rewriteAndFetch(Rewriteable.java:87) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:215) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:68) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:124) ~[?:?]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:87) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:76) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:403) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.client.support.AbstractClient.search(AbstractClient.java:537) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.rest.action.search.RestSearchAction.lambda$prepareRequest$2(RestSearchAction.java:100) ~[elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:97) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.xpack.security.rest.SecurityRestFilter.handleRequest(SecurityRestFilter.java:72) [x-pack-security-6.6.1.jar:6.6.1]
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:240) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.rest.RestController.tryAllHandlers(RestController.java:336) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:174) [elasticsearch-6.6.1.jar:6.6.1]
at org.elasticsearch.http.netty4.Netty4HttpServerTransport.dispatchRequest(Netty4HttpServerTransport.java:551) [transport-netty4-client-6.6.1.jar:6.6.1]
at org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:137) [transport-netty4-client-6.6.1.jar:6.6.1]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at org.elasticsearch.http.netty4.pipelining.HttpPipeliningHandler.channelRead(HttpPipeliningHandler.java:68) [transport-netty4-client-6.6.1.jar:6.6.1]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [netty-transport-4.1.32.Final.jar:4.1.32.Final]
at org.elasticsearch.http.netty4.cors.Netty4CorsHandler.channelRead(Netty4CorsHandler.java:86) [transport-netty4-client-6.6.1.jar:6.6.1]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [netty-transport-4.1.32.Final.jar:4.1.32.Final]

Hi @sblancocr,

it looks like you have lost all shard data for your cluster. Did you maybe repurpose some of the nodes recently? I think this could happen if you originally had all nodes with all roles and then changed the configuration as per your description above.

Notice that the cluster health status is red, meaning you have indices that are missing data. Also notice that active_primary_shards and active_shards are 0, which means that no indices are allocated.

Thanks...

I'm testing the tool.

That's why I don't know much about it yet...

How can I delete the data to bring the service back up?

Hi @sblancocr,

if you really want to wipe all data in the entire cluster, you can stop all nodes and then delete all your data folders (by default these reside in the installation dir; you may have configured them to reside elsewhere using the path.data setting). Be careful: this will delete all data, essentially reverting to a fresh installation.
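As a rough sketch, assuming a package (RPM/DEB) install where path.data defaults to /var/lib/elasticsearch and the service is managed by systemd (verify both against your own elasticsearch.yml before running anything), the wipe on each node would look like:

```shell
# Run on EVERY node in the cluster. WARNING: permanently deletes all cluster data.

# 1. Stop Elasticsearch on all nodes first.
sudo systemctl stop elasticsearch

# 2. Delete the data folder. The path below is the package default;
#    check the path.data setting in /etc/elasticsearch/elasticsearch.yml first.
sudo rm -rf /var/lib/elasticsearch/nodes

# 3. Start the node again only once every node has been wiped.
sudo systemctl start elasticsearch
```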

Alternatively, simply setting node.data: true on all nodes should recover the missing data. Provided all your indices have at least one replica, you should then be able to repurpose one node at a time, waiting for green cluster health after each node has been repurposed (thereby giving Elasticsearch time to re-establish two copies of every shard).
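A minimal sketch of that rolling approach, assuming the default config location /etc/elasticsearch/elasticsearch.yml and a node reachable on localhost:9200:

```shell
# 1. In /etc/elasticsearch/elasticsearch.yml on the node, set:
#      node.data: true
#    then restart so its shard copies become visible to the cluster again.
sudo systemctl restart elasticsearch

# 2. Before repurposing the next node, wait until the cluster is green,
#    i.e. every shard has its full set of copies again.
curl -s 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=5m&pretty'
```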

Thank you.
I'm going to set node.data: true to try it and see how the recovery goes.

Hello,
I deleted all the data and restarted all the nodes; nevertheless, the cluster status is still "red".
I ran the following query to check the status of the cluster; the output is below.

 curl -X GET 'http://log-elasticsearch-client-01:9200/_cluster/stats?human&pretty'
{
  "_nodes" : {
    "total" : 8,
    "successful" : 8,
    "failed" : 0
  },
  "cluster_name" : "log-elasticsearch",
  "cluster_uuid" : "G2YlYLizQYyiwC-Rpc0KYg",
  "timestamp" : 1554844834402,
  "status" : "red",
  "indices" : {
    "count" : 0,
    "shards" : { },
    "docs" : {
      "count" : 0,
      "deleted" : 0
    },
    "store" : {
      "size" : "0b",
      "size_in_bytes" : 0
    },
    "fielddata" : {
      "memory_size" : "0b",
      "memory_size_in_bytes" : 0,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size" : "0b",
      "memory_size_in_bytes" : 0,
      "total_count" : 0,
      "hit_count" : 0,
      "miss_count" : 0,
      "cache_size" : 0,
      "cache_count" : 0,
      "evictions" : 0
    },
    "completion" : {
      "size" : "0b",
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 0,
      "memory" : "0b",
      "memory_in_bytes" : 0,
      "terms_memory" : "0b",
      "terms_memory_in_bytes" : 0,
      "stored_fields_memory" : "0b",
      "stored_fields_memory_in_bytes" : 0,
      "term_vectors_memory" : "0b",
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory" : "0b",
      "norms_memory_in_bytes" : 0,
      "points_memory" : "0b",
      "points_memory_in_bytes" : 0,
      "doc_values_memory" : "0b",
      "doc_values_memory_in_bytes" : 0,
      "index_writer_memory" : "0b",
      "index_writer_memory_in_bytes" : 0,
      "version_map_memory" : "0b",
      "version_map_memory_in_bytes" : 0,
      "fixed_bit_set" : "0b",
      "fixed_bit_set_memory_in_bytes" : 0,
      "max_unsafe_auto_id_timestamp" : -9223372036854775808,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 8,
      "data" : 3,
      "coordinating_only" : 2,
      "master" : 3,
      "ingest" : 0
    },
    "versions" : [
      "6.7.1"
    ],
    "os" : {
      "available_processors" : 16,
      "allocated_processors" : 16,
      "names" : [
        {
          "name" : "Linux",
          "count" : 8
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "CentOS Linux 7 (Core)",
          "count" : 8
        }
      ],
      "mem" : {
        "total" : "29.5gb",
        "total_in_bytes" : 31770730496,
        "free" : "1.9gb",
        "free_in_bytes" : 2128527360,
        "used" : "27.6gb",
        "used_in_bytes" : 29642203136,
        "free_percent" : 7,
        "used_percent" : 93
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 0
      },
      "open_file_descriptors" : {
        "min" : 359,
        "max" : 371,
        "avg" : 365
      }
    },
    "jvm" : {
      "max_uptime" : "30.1m",
      "max_uptime_in_millis" : 1807670,
      "versions" : [
        {
          "version" : "1.8.0_131",
          "vm_name" : "Java HotSpot(TM) 64-Bit Server VM",
          "vm_version" : "25.131-b11",
          "vm_vendor" : "Oracle Corporation",
          "count" : 8
        }
      ],
      "mem" : {
        "heap_used" : "1.4gb",
        "heap_used_in_bytes" : 1568928224,
        "heap_max" : "23.8gb",
        "heap_max_in_bytes" : 25630343168
      },
      "threads" : 239
    },
    "fs" : {
      "total" : "107.9gb",
      "total_in_bytes" : 115880230912,
      "free" : "88gb",
      "free_in_bytes" : 94576070656,
      "available" : "88gb",
      "available_in_bytes" : 94576070656
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "security4" : 8
      },
      "http_types" : {
        "security4" : 8
      }
    }
  }
}

If I query the indices, this is what it shows:

curl -sS -XGET "http://log-elasticsearch-client-01.alpha.ci.ucr.ac.cr:9200/_cat/indices?"
red open metricbeat-2019.03.25 JffV9ODwRY-akZTsEb0c0Q 5 1    
red open .kibana_1             FjS7uuL9RpaebBYYh1m9gg 1 1    
red open metricbeat-2019.03.21 2ADyV09qTg6fWRZsHe_kLA 5 1    
red open metricbeat-2019.03.05 2GeJJBXkSBWXP7wrEH-CHA 5 1

How did you delete the data? Via Elasticsearch (curl -X DELETE ...) commands, or just rm on the data node instances?

By the way, as a learner it may be better to start with single-node Elasticsearch and Kibana instances on a laptop and work through some basic cases using sample data.

Hi @sblancocr,

let's try to find out why those indices are not allocated. You can use the allocation explain API for that, see: https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-allocation-explain.html

Pick one of the indices and use the explain API to figure out why the primary or replica cannot be allocated. Could be a variety of reasons, like disk space, allocation filters etc.
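For example, picking one of the indices from your _cat/indices output (the index name, shard number, and host here are placeholders; adjust them to your cluster), the call would look roughly like:

```shell
# Ask Elasticsearch to explain why a specific shard is unassigned.
# Replace the index name and shard number with the ones you want to inspect.
curl -s -H 'Content-Type: application/json' \
  'http://localhost:9200/_cluster/allocation/explain?pretty' \
  -d '{
    "index": "metricbeat-2019.03.25",
    "shard": 0,
    "primary": true
  }'
```

The response includes an "unassigned_info" reason and a per-node breakdown of why allocation was not possible (disk thresholds, allocation filters, missing shard copies, etc.).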

Also, to double check that all data folders were actually deleted, you can do:

curl localhost:9200/_cat/indices?h=h,s,i,id,p,r,dc,dd,ss,creation.date.string

The creation dates should then show timestamps from after you cleaned everything out.

Maybe I need to clarify too that in order to wipe the cluster completely and start over, you have to shut down all 8 nodes in the cluster and then delete all the data folders (as pointed to by path.data) on all 8 nodes before starting them again. Be warned that this will permanently delete all data.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.