Very slow cluster, cannot find "_cluster" index?

emilija · September 5, 2018, 12:47pm

Hi, we have a cluster of 34 nodes running Elasticsearch version 2.2. The cluster has started to behave very weirdly: shard allocation is very slow and there are "URGENT" pending tasks that have been in the queue for over 8 hours. GET _cluster/allocation/explain returns an error saying that the "_cluster" index is not found:
{
  "error": {
    "root_cause": [
      {
        "type": "index_not_found_exception",
        "reason": "no such index",
        "resource.type": "index_expression",
        "resource.id": "_cluster",
        "index": "_cluster"
      }
    ],
    "type": "index_not_found_exception",
    "reason": "no such index",
    "resource.type": "index_expression",
    "resource.id": "_cluster",
    "index": "_cluster"
  },
  "status": 404
}
Could you guys maybe give some pointers on how to interpret the fact that the "_cluster" index is not found? The logs from the master node say that processing of some events times out from time to time, and that shards cannot be reallocated to some nodes because of the disk threshold. We would prefer to avoid restarting the whole cluster, if possible.

Bernt_Rostad · September 5, 2018, 1:03pm

The Cluster Allocation Explain API was released with Elasticsearch 5.0 (see this link), so the command you tried does not work on your 2.2 cluster.
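A minimal sketch of that version check, assuming the version string has the dotted form Elasticsearch reports under version.number. The helper names `parse_major_minor` and `supports_allocation_explain` are made up for illustration and are not part of any Elasticsearch client.

```python
def parse_major_minor(version_number: str) -> tuple:
    """Return (major, minor) from a dotted version string such as '2.2.2'."""
    major, minor = version_number.split(".")[:2]
    return int(major), int(minor)


def supports_allocation_explain(version_number: str) -> bool:
    """The Cluster Allocation Explain API is available from Elasticsearch 5.0 on."""
    return parse_major_minor(version_number) >= (5, 0)


# Hypothetical helper names; the version strings are the two mentioned in this thread.
for version in ("2.2.2", "5.6.4"):
    print(version, supports_allocation_explain(version))
```

Comparing tuples rather than raw strings avoids ordering mistakes such as "10.0" sorting before "5.0".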

emilija · September 5, 2018, 1:50pm

Sorry, I have zero experience with this, and Stack Overflow said that version.number is the version number. Here's more info:
"version": {
  "number": "2.2.2",
  "build_hash": "fcc01dd81f4de6b2852888450ce5a56436fd5852",
  "build_timestamp": "2016-03-29T08:49:35Z",
  "build_snapshot": false,
  "lucene_version": "5.4.1"
}

Bernt_Rostad · September 5, 2018, 2:13pm

Strange, then you should be able to run the command.

Testing the same call in my 5.6.4 cluster I get an exception too:

user@node-04:~$ curl -X GET "localhost:9200/_cluster/allocation/explain?pretty"
{
  "error" : {
    "root_cause" : [
      {
        "type" : "remote_transport_exception",
        "reason" : "[master2][10.xx.xx.xx:9xxx][cluster:monitor/allocation/explain]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "unable to find any unassigned shards to explain [ClusterAllocationExplainRequest[useAnyUnassignedShard=true,includeYesDecisions?=false]"
  },
  "status" : 400
}

This is because there are no unassigned shards in my cluster, but in your situation it should produce a sensible result, and I don't understand why it tries to interpret "_cluster" as an index name. That is very strange.

Bernt_Rostad · September 5, 2018, 2:18pm

Wait a minute. 5.4.1 is the Lucene version; you are running Elasticsearch version 2.2.2.

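Then my initial reply still stands: you can't use the Cluster Allocation Explain API prior to Elasticsearch version 5.0. Since 2.2.2 has no such endpoint, the request path is presumably matched as an index/type/id document lookup, which would explain why "_cluster" is reported as a missing index.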
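Since the explain API is not available on 2.x, one way to look at the "URGENT" pending tasks from the first post is the `_cluster/pending_tasks` endpoint. The following is only a sketch: the function names, the `localhost:9200` address, and the sample response are assumptions for illustration, and the field names follow the documented pending-tasks response (`priority`, `time_in_queue_millis`).

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_pending_tasks(base_url: str = "http://localhost:9200") -> dict:
    """Fetch the pending cluster tasks (base_url is an assumed address)."""
    with urlopen(f"{base_url}/_cluster/pending_tasks") as resp:
        return json.load(resp)


def summarize_pending_tasks(response: dict) -> dict:
    """Count tasks per priority and report the longest time spent in the queue."""
    tasks = response.get("tasks", [])
    by_priority = Counter(task["priority"] for task in tasks)
    oldest_ms = max((task["time_in_queue_millis"] for task in tasks), default=0)
    return {
        "count": len(tasks),
        "by_priority": dict(by_priority),
        "oldest_hours": oldest_ms / 3_600_000,
    }


# Made-up sample response, not output from the cluster discussed in this thread.
sample = {
    "tasks": [
        {"insert_order": 101, "priority": "URGENT",
         "source": "shard-started (illustrative)", "time_in_queue_millis": 29_000_000},
        {"insert_order": 102, "priority": "HIGH",
         "source": "refresh-mapping (illustrative)", "time_in_queue_millis": 1_000},
    ]
}
summary = summarize_pending_tasks(sample)
print(summary)
```

Against a live cluster the call would be `summarize_pending_tasks(fetch_pending_tasks())`; the `source` field of the oldest tasks usually hints at what the master is stuck on.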

emilija · September 5, 2018, 2:33pm

Yup, posted the answer, read documentation some more (the very basics, honestly). You are right. Thank you. Back to reading and looking through various logs for us then.

Christian_Dahlqvist · September 5, 2018, 2:34pm

How many indices and shards do you have in the cluster? How much data?

emilija · September 5, 2018, 2:54pm

_cluster/stats says 5776 indices and 42616 shards, with store.size_in_bytes at 21998772983623 (almost 22 TB).

Christian_Dahlqvist · September 5, 2018, 2:58pm

That is a lot of shards for that data volume. The average shard size is only around 500 MB. Having a lot of small shards can be very inefficient, so I would recommend reading this blog post about shards and sharding practices and then trying to reduce the shard count.
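The arithmetic behind that estimate can be checked with a few lines. This is just a back-of-the-envelope sketch using the figures quoted from _cluster/stats and the 34 nodes mentioned in the first post; the variable names are arbitrary.

```python
# Figures quoted earlier in this thread.
store_size_bytes = 21_998_772_983_623
total_shards = 42_616
total_indices = 5_776
node_count = 34

avg_shard_mb = store_size_bytes / total_shards / 1e6  # decimal megabytes
shards_per_node = total_shards / node_count
shards_per_index = total_shards / total_indices

print(f"average shard size: {avg_shard_mb:.0f} MB")
print(f"shards per node:    {shards_per_node:.0f}")
print(f"shards per index:   {shards_per_index:.1f}")
```

That works out to roughly 516 MB per shard and about 1,250 shards per node.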

emilija · September 5, 2018, 3:01pm

Will do. Thank you!

system · October 3, 2018, 3:13pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
