We have a 2-node ES cluster supporting a custom app. The nodes are as follows:
- 'hot' server with 64 CPUs, 256 GB RAM, 40 TB of data, and IP 1.2.3.4 (obfuscated)
- 'warm' server with 24 CPUs, 128 GB RAM, 100 TB of data, and IP 5.6.7.8
- both running Red Hat Enterprise Linux 9
- both running ES 7.10 (yes, an old version, kept for licensing reasons; above my pay grade!)
We installed the Linux version of Kibana on the warm server and, in general, it works great. Unfortunately, whenever we perform maintenance (e.g. a patch/reboot) and the cluster has to resync, recovery stalls at 99% with one shard failing allocation -- and that shard belongs to the index named ".kibana_1", which is clearly Kibana's configuration data.
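For reference, this is roughly how I confirm which shard is stuck after a reboot (commands reproduced from memory, so treat them as approximate):
# overall recovery status; 'unassigned_shards' sits at 1
$ curl -s 'localhost:9200/_cluster/health?pretty'
# list only the unassigned shard(s) and the reason
$ curl -s 'localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED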
I have been working around this by deleting it (the whole .kibana_1 index). When I then re-open Kibana, it prompts me for initial setup and works fine; the index gets recreated (on the warm server again) and the cluster is 100% happy with 'green' status -- until the next maintenance window triggers a resync again. Is there a way to reconfigure things so that the shard reallocates successfully on its own?
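In case it matters, the workaround is roughly just this (reconstructed, since I don't have the exact command in front of me; destructive, I know, but Kibana recreates the index on its next startup):
$ curl -X DELETE 'localhost:9200/.kibana_1'
$ curl -s 'localhost:9200/_cluster/health?pretty'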
Here's what the allocation 'explain' API shows (all node names, IPs, and hostnames are obfuscated):
$ curl localhost:9200/_cluster/allocation/explain?pretty
{
  "index" : ".kibana_1",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2025-06-19T22:02:35.168Z",
    "details" : "node_left [def456]",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "def456",
      "node_name" : "warm-server",
      "transport_address" : "5.6.7.8:9300",
      "node_attributes" : {
        "box_type" : "warm"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : "node does not match index setting [index.routing.allocation.require] filters [box_type:\"hot\"]"
        }
      ]
    },
    {
      "node_id" : "abc123",
      "node_name" : "hot-server",
      "transport_address" : "1.2.3.4:9300",
      "node_attributes" : {
        "box_type" : "hot"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "a copy of this shard is already allocated to this node [[.kibana_1][0], node[abc123], [P], s[STARTED], a[id=ghi789]]"
        }
      ]
    },
    {
      "node_id" : "abc1234",
      "node_name" : "hot-server-data",
      "transport_address" : "1.2.3.4:9301",
      "node_attributes" : {
        "box_type" : "hot"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "a copy of this shard is already allocated to host address [5.6.7.8], on node [abc1234], and [cluster.routing.allocation.same_shard.host] is [true] which forbids more than one node on this host from holding a copy of this shard"
        }
      ]
    }
  ]
}
At the suggestion of other techs, I have tried updating the index settings with (roughly the command shown after this list):
- "number_of_replicas" : 0
- "auto_expand_replicas" : false
but neither of those changes helped.
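For completeness, the settings update I ran looked roughly like this (reconstructed from shell history, so the exact payload may differ slightly):
$ curl -X PUT 'localhost:9200/.kibana_1/_settings' -H 'Content-Type: application/json' -d '
{
  "index" : {
    "number_of_replicas" : 0,
    "auto_expand_replicas" : false
  }
}'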
Instead, I'm wondering if the issue is that I installed Kibana on the 'warm' server, while the 'explain' output shows the index requires allocation on a 'hot' node (the index.routing.allocation.require filter). Is there a way to configure that shard to be happy on the warm server instead? (Which it is, happily, when I re-initialize it from scratch.)
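To frame the question: based on the explain output, I assume the relevant knob is index.routing.allocation.require.box_type on the .kibana_1 index, and that something like the following (inspecting it, then setting it to null to clear it, or perhaps to "warm") is what I'd need -- but I have not tried this and would like confirmation that it's the right/safe approach:
# inspect the current allocation filter on the index
$ curl -s 'localhost:9200/.kibana_1/_settings?pretty'
# clear the 'require hot' filter (untested on my cluster)
$ curl -X PUT 'localhost:9200/.kibana_1/_settings' -H 'Content-Type: application/json' -d '
{
  "index.routing.allocation.require.box_type" : null
}'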
Any suggestions are appreciated! Thank you!