Hello,
I'm currently stuck on an issue I can't resolve.
The plan was to do two upgrades: a first one from 5.4.2 to 5.6.16, followed by the major upgrade to 6.8.0.
After upgrading two nodes, I had to stop everything.
So I looked into the cause:
GET /_cluster/allocation/explain
{
"index": "siginf-cnfs-syslog-2019.01.25",
"shard": 3,
"primary": false,
"current_state": "unassigned",
"unassigned_info": {
"reason": "NODE_LEFT",
"at": "2019-06-18T11:29:07.442Z",
"details": "node_left[lkwG5GSVRe2cm-5UQu73hQ]",
"last_allocation_status": "no_attempt"
},
"can_allocate": "throttled",
"allocate_explanation": "allocation temporarily throttled",
"node_allocation_decisions": [
{
"node_id": "9sh-p97VQ7WK9JMMHphL-A",
"node_name": "mutsxpidx005-docker-es-2",
"transport_address": "172.31.249.245:9300",
"node_attributes": {
"zone": "SDC3",
"ml.max_open_jobs": "10",
"ml.enabled": "true"
},
"node_decision": "throttled",
"deciders": [
{
"decider": "throttling",
"decision": "THROTTLE",
"explanation": "reached the limit of outgoing shard recoveries [10] on the node [9sh-p97VQ7WK9JMMHphL-A] which holds the primary, cluster setting [cluster.routing.allocation.node_concurrent_outgoing_recoveries=10] (can also be set via [cluster.routing.allocation.node_concurrent_recoveries])"
}
]
},
{
"node_id": "I6s-YOD3SEmRDO9ogU7qHg",
"node_name": "mutsxpidx005-docker-es-1",
"transport_address": "172.31.249.245:9301",
"node_attributes": {
"zone": "SDC3",
"ml.max_open_jobs": "10",
"ml.enabled": "true"
},
"node_decision": "throttled",
"deciders": [
{
"decider": "throttling",
"decision": "THROTTLE",
"explanation": "reached the limit of outgoing shard recoveries [10] on the node [I6s-YOD3SEmRDO9ogU7qHg] which holds the primary, cluster setting [cluster.routing.allocation.node_concurrent_outgoing_recoveries=10] (can also be set via [cluster.routing.allocation.node_concurrent_recoveries])"
}
]
},
{
"node_id": "bQylteMjSd6RvJ0ioODj6w",
"node_name": "mutsxpidx008-docker-es-1",
"transport_address": "172.31.249.250:9300",
"node_attributes": {
"zone": "SDC3"
},
"node_decision": "no",
"deciders": [
{
"decider": "awareness",
"decision": "NO",
"explanation": "there are too many copies of the shard allocated to nodes with attribute [zone], there are [2] total configured shard copies for this shard id and [3] total attribute values, expected the allocated shard count per attribute [2] to be less than or equal to the upper bound of the required number of shards per attribute [1]"
}
]
},
{
"node_id": "wz6Mg3p2Tzy2DDKYNkdhEw",
"node_name": "mutsxpidx004-docker-es-2",
"transport_address": "172.31.249.244:9300",
"node_attributes": {
"zone": "SDC1",
"ml.max_open_jobs": "10",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "throttling",
"decision": "THROTTLE",
"explanation": "reached the limit of outgoing shard recoveries [10] on the node [wz6Mg3p2Tzy2DDKYNkdhEw] which holds the primary, cluster setting [cluster.routing.allocation.node_concurrent_outgoing_recoveries=10] (can also be set via [cluster.routing.allocation.node_concurrent_recoveries])"
},
{
"decider": "awareness",
"decision": "NO",
"explanation": "there are too many copies of the shard allocated to nodes with attribute [zone], there are [2] total configured shard copies for this shard id and [3] total attribute values, expected the allocated shard count per attribute [2] to be less than or equal to the upper bound of the required number of shards per attribute [1]"
}
]
}
]
}
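In case it helps, this is how I'm watching the recoveries that are occupying those outgoing slots (the active_only flag may not be supported on every 5.x release; dropping it simply lists all recoveries, active or not):

GET /_cat/recovery?active_only=true&v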
Originally the default shard count was 5, which I changed to 2 in the main template.
However, the older indices are still at 5, and the shard count of an existing index can't be changed after creation.
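For reference, the template change was along these lines (the template name and pattern below are illustrative, and this is the 5.x syntax where the pattern goes in the template field):

PUT /_template/siginf-cnfs-syslog
{
"template": "siginf-cnfs-syslog-*",
"settings": {
"index.number_of_shards": 2
}
}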
I've been through several different discussions without really finding a solution. The cluster is now in an unstable state, and my main goal is to get back to green before continuing the upgrade, knowing that the master node has already been upgraded and that, as I understand it, it won't be able to reallocate shards back onto nodes running older versions.
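To track progress toward green I'm listing the remaining unassigned shards, and my understanding is that allocations that have failed outright (as opposed to being throttled, like the ones above) can be retried manually:

GET /_cat/shards?v&h=index,shard,prirep,state,unassigned.reason
POST /_cluster/reroute?retry_failed=true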
Here is the cluster configuration:
{
"persistent": {
"cluster": {
"routing": {
"allocation": {
"node_concurrent_incoming_recoveries": "10",
"disk": {
"watermark": {
"low": "90%"
}
},
"node_initial_primaries_recoveries": "20",
"enable": "all",
"node_concurrent_outgoing_recoveries": "10"
}
}
},
"indices": {
"recovery": {
"max_bytes_per_sec": "500mb"
}
}
},
"transient": {
"cluster": {
"routing": {
"rebalance": {
"enable": "all"
},
"allocation": {
"cluster_concurrent_rebalance": "2",
"node_concurrent_recoveries": "5",
"disk": {
"watermark": {
"low": "90%"
}
},
"enable": "all"
}
}
},
"discovery": {
"zen": {
"minimum_master_nodes": "3"
}
}
}
}
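One detail I'm unsure about: the transient cluster.routing.allocation.node_concurrent_recoveries (5) overlaps with the explicit persistent incoming/outgoing limits (10), so the effective values aren't obvious at a glance. To keep a single source of truth, the transient shortcut could be cleared; as far as I know, setting a value to null resets it:

PUT /_cluster/settings
{
"transient": {
"cluster.routing.allocation.node_concurrent_recoveries": null
}
}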
Thanks in advance for your help.
Jonathan