Could not reassign UNASSIGNED shards (Elasticsearch 5.6)


(Serg) #1

After a reboot, I got some unassigned shards.

logstash-2017.12.08 3 p UNASSIGNED
logstash-2017.12.08 3 r UNASSIGNED
logstash-2017.12.08 4 p UNASSIGNED
logstash-2017.12.08 4 r UNASSIGNED
logstash-2017.12.08 2 p UNASSIGNED
logstash-2017.12.08 2 r UNASSIGNED
logstash-2017.12.08 1 p UNASSIGNED
logstash-2017.12.08 1 r UNASSIGNED
logstash-2017.12.08 0 p UNASSIGNED
logstash-2017.12.08 0 r UNASSIGNED

Now I am trying to assign them, but I get:

root@04elasticsearch:~# curl -XPOST 'http://localhost:9200/_cluster/reroute' -d '{ "commands" : [ { "index" : "logstash-2017.12.08", "allocate" : { "shard" : 3, "node" : "7vuMfymHTTOzxlcTtAkk9g", "allow_primary" : true } } ]}' | jq .
{
"status": 400,
"error": {
"caused_by": {
"col": 31,
"line": 1,
"reason": "Unknown AllocationCommand [index]",
"type": "unknown_named_object_exception"
},
"col": 31,
"line": 1,
"reason": "[cluster_reroute] failed to parse field [commands]",
"type": "parsing_exception",
"root_cause": [
{
"col": 31,
"line": 1,
"reason": "Unknown AllocationCommand [index]",
"type": "unknown_named_object_exception"
}
]
}
}
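A note on the error above: in 5.x, each entry in "commands" must be an object whose single key is a command name (move, cancel, allocate_replica, allocate_stale_primary, or allocate_empty_primary), so the parser rejects the top-level "index" key with "Unknown AllocationCommand [index]". For a primary with no surviving copy, the 5.x equivalent of the old allocate-with-allow_primary request would be along these lines; note this sketch is destructive, since it brings the shard up empty:

POST /_cluster/reroute
{
  "commands" : [
    {
      "allocate_empty_primary" : {
        "index" : "logstash-2017.12.08",
        "shard" : 3,
        "node" : "7vuMfymHTTOzxlcTtAkk9g",
        "accept_data_loss" : true
      }
    }
  ]
}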


(Serg) #2

After that, I tried:

GET /_cluster/allocation/explain
{
"index": "logstash-2017.12.05",
"shard": 0,
"primary": true
}

"index": "logstash-2017.12.05",
"shard": 0,
"primary": true,
"current_state": "unassigned",
"unassigned_info": {
"reason": "ALLOCATION_FAILED",
"at": "2017-12-25T18:48:27.248Z",
"failed_allocation_attempts": 1,
"details": "failed recovery, failure RecoveryFailedException[[logstash-2017.12.05][0]: Recovery failed on {7vuMfym}{7vuMfymHTTOzxlcTtAkk9g}{e4mS1ZIKS5aJHUA3PHTx3g}{10.3.2.45}{10.3.2.45:9300}{ml.max_open_jobs=10, ml.enabled=true}]; nested: IndexShardRecoveryException[failed to fetch index version after copying it over]; nested: IndexShardRecoveryException[shard allocated for local recovery (post api), should exist, but doesn't, current files: []]; nested: FileNotFoundException[no segments* file found in store(mmapfs(/vol/nodes/0/indices/AgVIeZ2cQji5SSFlnmS8dw/0/index)): files: []]; ",
"last_allocation_status": "no_valid_shard_copy"
},
"can_allocate": "no_valid_shard_copy",
"allocate_explanation": "cannot allocate because all found copies of the shard are either stale or corrupt",
"node_allocation_decisions": [
{
"node_id": "7vuMfymHTTOzxlcTtAkk9g",
"node_name": "7vuMfym",
"transport_address": "10.3.2.45:9300",
"node_attributes": {
"ml.max_open_jobs": "10",
"ml.enabled": "true"
},
"node_decision": "no",
"store": {
"in_sync": true,
"allocation_id": "Wx2RRHWjQUWO38H6u8eXnA",
"store_exception": {
"type": "file_not_found_exception",
"reason": "no segments* file found in SimpleFSDirectory@/vol/nodes/0/indices/AgVIeZ2cQji5SSFlnmS8dw/0/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@4ea20096: files: []"
}
}
},
{
"node_id": "dEcFJ6h_QfyKNr_N7362QQ",
"node_name": "dEcFJ6h",
"transport_address": "10.3.2.39:9300",
"node_attributes": {
"ml.max_open_jobs": "10",
"ml.enabled": "true"
},
"node_decision": "no",
"store": {
"found": false
}
}
]
}
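Given that unassigned_info shows "failed_allocation_attempts": 1 with reason ALLOCATION_FAILED, one low-risk thing to try first (assuming the on-disk copy is actually usable) is asking the cluster to retry allocations that hit the retry limit:

POST /_cluster/reroute?retry_failed=true

If the store is genuinely empty, as the FileNotFoundException ("no segments* file found ... files: []") suggests, the remaining options are restoring the index from a snapshot or forcing an empty primary with allocate_empty_primary and accepting the data loss.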


(Serg) #3

Also, I tried:

POST /logstash-2017.12.05/_close
POST /logstash-2017.12.05/_open

and

PUT /logstash-2017.12.05/_settings
{
"index" : {
"number_of_replicas" : 0
}
}

and then

PUT /logstash-2017.12.05/_settings
{
"index" : {
"number_of_replicas" : 1
}
}


(Serg) #4

and

POST /_cluster/reroute
{
"commands" : [
{
"move" : {
"index" : "logstash-2017.12.05", "shard" : 0,
"from_node" : "dEcFJ6h_QfyKNr_N7362QQ", "to_node" : "dEcFJ6h_QfyKNr_N7362QQ"
}
},
{
"allocate_replica" : {
"index" : "logstash-2017.12.05", "shard" : 1,
"node" : "dEcFJ6h_QfyKNr_N7362QQ"
}
}
]
}

But in this case I don't know what I should put in the "from_node" field, since this index is not assigned to any node.
By the way, allocation is enabled for all indices.
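As a side note, the cat shards API can show the recorded reason each shard is unassigned, which helps decide between allocate_replica (for replicas whose primary is alive) and snapshot restore or allocate_empty_primary (for primaries with no valid copy). A sketch:

GET /_cat/shards?v&h=index,shard,prirep,state,unassigned.reason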


(andy_zhou) #5

You can solve this, as far as I know.
I think your cluster has many shards.
I tested this in my cluster; the fastest fix I found was to close the index and open it again.


(Serg) #6

I don't understand; please explain how to fix it.


(Christian Dahlqvist) #7

What is the full output of the cluster stats API?


(Serg) #8

{
"_nodes": {
"total": 2,
"successful": 2,
"failed": 0
},
"cluster_name": "prod",
"timestamp": 1514277344003,
"status": "red",
"indices": {
"count": 81,
"shards": {
"total": 409,
"primaries": 205,
"replication": 0.9951219512195122,
"index": {
"shards": {
"min": 1,
"max": 10,
"avg": 5.049382716049383
},
"primaries": {
"min": 1,
"max": 5,
"avg": 2.5308641975308643
},
"replication": {
"min": 0,
"max": 1,
"avg": 0.9876543209876543
}
}
},
"docs": {
"count": 445304538,
"deleted": 26909
},
"store": {
"size": "712.4gb",
"size_in_bytes": 765019382231,
"throttle_time": "0s",
"throttle_time_in_millis": 0
},
"fielddata": {
"memory_size": "12.7kb",
"memory_size_in_bytes": 13104,
"evictions": 0
},
"query_cache": {
"memory_size": "0b",
"memory_size_in_bytes": 0,
"total_count": 0,
"hit_count": 0,
"miss_count": 0,
"cache_size": 0,
"cache_count": 0,
"evictions": 0
},
"completion": {
"size": "0b",
"size_in_bytes": 0
},
"segments": {
"count": 8011,
"memory": "1.8gb",
"memory_in_bytes": 2034123647,
"terms_memory": "1.6gb",
"terms_memory_in_bytes": 1733836573,
"stored_fields_memory": "188.9mb",
"stored_fields_memory_in_bytes": 198105344,
"term_vectors_memory": "0b",
"term_vectors_memory_in_bytes": 0,
"norms_memory": "295.4kb",
"norms_memory_in_bytes": 302528,
"points_memory": "15.9mb",
"points_memory_in_bytes": 16733358,
"doc_values_memory": "81.2mb",
"doc_values_memory_in_bytes": 85145844,
"index_writer_memory": "46.1mb",
"index_writer_memory_in_bytes": 48391088,
"version_map_memory": "81.9kb",
"version_map_memory_in_bytes": 83886,
"fixed_bit_set": "72kb",
"fixed_bit_set_memory_in_bytes": 73728,
"max_unsafe_auto_id_timestamp": 1514246409906,
"file_sizes": {}
}
},
"nodes": {
"count": {
"total": 2,
"data": 2,
"coordinating_only": 0,
"master": 2,
"ingest": 2
},
"versions": [
"5.5.2"
],
"os": {
"available_processors": 16,
"allocated_processors": 16,
"names": [
{
"name": "Linux",
"count": 2
}
],
"mem": {
"total": "119.9gb",
"total_in_bytes": 128781852672,
"free": "773.6mb",
"free_in_bytes": 811261952,
"used": "119.1gb",
"used_in_bytes": 127970590720,
"free_percent": 1,
"used_percent": 99
}
},
"process": {
"cpu": {
"percent": 5
},
"open_file_descriptors": {
"min": 740,
"max": 758,
"avg": 749
}
},
"jvm": {
"max_uptime": "15.6h",
"max_uptime_in_millis": 56253923,
"versions": [
{
"version": "1.8.0_144",
"vm_name": "Java HotSpot(TM) 64-Bit Server VM",
"vm_version": "25.144-b01",
"vm_vendor": "Oracle Corporation",
"count": 2
}
],
"mem": {
"heap_used": "25.5gb",
"heap_used_in_bytes": 27453966464,
"heap_max": "63.8gb",
"heap_max_in_bytes": 68580016128
},
"threads": 230
},
"fs": {
"total": "3.4tb",
"total_in_bytes": 3740103417856,
"free": "2.7tb",
"free_in_bytes": 2969643327488,
"available": "2.5tb",
"available_in_bytes": 2779609776128
},
"plugins": [
{
"name": "ingest-geoip",
"version": "5.5.2",
"description": "Ingest processor that uses looksup geo data based on ip adresses using the Maxmind geo database",
"classname": "org.elasticsearch.ingest.geoip.IngestGeoIpPlugin",
"has_native_controller": false
},
{
"name": "repository-s3",
"version": "5.5.2",
"description": "The S3 repository plugin adds S3 repositories",
"classname": "org.elasticsearch.repositories.s3.S3RepositoryPlugin",
"has_native_controller": false
},
{
"name": "x-pack",
"version": "5.5.2",
"description": "Elasticsearch Expanded Pack Plugin",
"classname": "org.elasticsearch.xpack.XPackPlugin",
"has_native_controller": true
}
],
"network_types": {
"transport_types": {
"security4": 2
},
"http_types": {
"security4": 2
}
}
}
}


(Christian Dahlqvist) #9

It doesn't look like you have a crazy amount of shards, which can often cause these kinds of problems. How many shards are still UNASSIGNED?

Is there anything in the logs?

Do you have minimum_master_nodes set to 2?


(Serg) #11

curl -s http://localhost:9200/_cat/shards | grep UNASS | wc -l
169

GET /_nodes/_master
{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "prod",
"nodes": {
"dEcFJ6h_QfyKNr_N7362QQ": {
"name": "dEcFJ6h",


(Christian Dahlqvist) #12

@radi Please open a separate thread for your completely unrelated question.


(Christian Dahlqvist) #13

Can you please check this in your elasticsearch.yml config file?


(Serg) #14

No, I don't have it set:

root@04elasticsearch:/home/ubuntu# cat /etc/elasticsearch/elasticsearch.yml | grep -i master


(Serg) #15

cluster.name: prod
path.data: /vol
path.logs: /vol/logs
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: ["10......","10......"]
xpack:
security:
authc:
realms:
native1:
type: native
order: 0

All my conf


(Christian Dahlqvist) #16

If you have 2 (or 3) master-eligible nodes, you need to set minimum_master_nodes to 2 in order to avoid split-brain scenarios. This means that the cluster will go red as soon as one node is missing, which is the correct behaviour in order to prevent data loss. If you need the cluster to be able to operate with one node down or unavailable, you need a minimum of 3 master-eligible nodes (which allows a majority of nodes to elect a master even with one node missing).

You might therefore be experiencing a split-brain scenario, preventing the shards from being found and allocated.

Fix this and see if that allows the shards to get allocated.
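Note that a transient cluster setting is lost on a full cluster restart. To make the change permanent, the usual approach is to also set it in elasticsearch.yml on every master-eligible node, e.g.:

discovery.zen.minimum_master_nodes: 2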


(Serg) #17

PUT _cluster/settings
{
"transient": {
"discovery.zen.minimum_master_nodes": 2
}
}

{
"acknowledged": true,
"persistent": {},
"transient": {
"discovery": {
"zen": {
"minimum_master_nodes": "2"
}
}
}
}

But I got the same result.

I found that some indices have a size value shown, but some don't.

logstash-2017.12.08 2 p UNASSIGNED

For the indices that had a size value I tried a reindex, and that looks like it helped.
But all the others don't work that way.

For example:

POST _reindex
{
"source": {
"index": "logstash-2017.12.08"
},
"dest": {
"index": "logstash-2017.12.33"
}
}

{
"error": {
"root_cause": [],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": []
},
"status": 503
}


(Christian Dahlqvist) #18

What is the output of the cat nodes API?


(Serg) #19

ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.3.2.39 9 98 6 0.07 0.30 0.81 mdi - dEcFJ6h
10.3.2.45 9 95 6 0.07 0.34 0.65 mdi * 7vuMfym


(Christian Dahlqvist) #20

Not sure I understood what you mean...

Now that both nodes are part of the same cluster, are shards getting allocated in the background? Do you see anything in the logs?


(Serg) #21

Normal shards have a size column:

logstash-2017.11.14 0 p STARTED 3516457 3.2gb 10.3.2.45 7vuMfym

but UNASSIGNED shards don't:

logstash-2017.12.17 1 p UNASSIGNED