Validation_exception: Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2000]/[2000] maximum shards open;

Hi everyone. I know similar topics have come up before, but this still seems odd to me. I get the message above on my instance; however, when I query _cluster/health I get this:

"cluster_name" : "docker-cluster",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 1031,
  "active_shards" : 1031,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 969,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 51.55,

That seems odd: I have a lot of unassigned shards and still get the error. What am I missing here? And how can I solve this?

Thanks in advance

Welcome to our community! :smiley:

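TL;DR: the limit counts replica shards too, including unassigned ones, so your 1031 active + 969 unassigned shards add up to the 2000 maximum. That said, you have a tonne of shards for a single-node cluster, and that is not ideal.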
What is the output from the _cluster/stats?pretty&human API?
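
For anyone following the arithmetic, here is a minimal sketch. It assumes the limit is checked against every shard of every open index, assigned or not, which is what the [2000]/[2000] in the error suggests. The numbers are copied from the health output in the question; the function names are made up for illustration, and the real check happens server-side.

```python
# Figures copied from the cluster health output in the question.
health = {
    "active_shards": 1031,
    "initializing_shards": 0,
    "unassigned_shards": 969,
}

# Taken from the error message: [2000]/[2000] maximum shards open.
SHARD_LIMIT = 2000


def open_shards(health: dict) -> int:
    """Count every shard of every open index, assigned or not.

    Relocating shards are already part of active_shards,
    so they are not added a second time.
    """
    return (
        health["active_shards"]
        + health["initializing_shards"]
        + health["unassigned_shards"]
    )


def would_exceed(current: int, to_add: int, limit: int) -> bool:
    """Simplified version of the comparison behind the validation error."""
    return current + to_add > limit


current = open_shards(health)
# A new index with 1 primary and 1 replica adds 2 shards,
# matching "would add [2] total shards" in the error.
print(current, would_exceed(current, 2, SHARD_LIMIT))  # 2000 True
```

So the unassigned replicas do not free up any room: they are already counted.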

Thanks for the welcome and the answer :slight_smile:

Here is the output:

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "AH7tF3gxSbS6XGEjnbAO4A",
  "timestamp" : 1631006769431,
  "status" : "yellow",
  "indices" : {
    "count" : 866,
    "shards" : {
      "total" : 894,
      "primaries" : 894,
      "replication" : 0.0,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 5,
          "avg" : 1.0323325635103926
        },
        "primaries" : {
          "min" : 1,
          "max" : 5,
          "avg" : 1.0323325635103926
        },
        "replication" : {
          "min" : 0.0,
          "max" : 0.0,
          "avg" : 0.0
        }
      }
    },
    "docs" : {
      "count" : 3773329721,
      "deleted" : 51770743
    },
    "store" : {
      "size" : "603.5gb",
      "size_in_bytes" : 648062733766,
      "reserved" : "0b",
      "reserved_in_bytes" : 0
    },
    "fielddata" : {
      "memory_size" : "83.8mb",
      "memory_size_in_bytes" : 87957608,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size" : "274.6mb",
      "memory_size_in_bytes" : 288026816,
      "total_count" : 72939691617,
      "hit_count" : 888282270,
      "miss_count" : 72051409347,
      "cache_size" : 45969,
      "cache_count" : 2650898,
      "evictions" : 2604929
    },
    "completion" : {
      "size" : "0b",
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 6400,
      "memory" : "129.9mb",
      "memory_in_bytes" : 136257004,
      "terms_memory" : "58.1mb",
      "terms_memory_in_bytes" : 60989552,
      "stored_fields_memory" : "40.2mb",
      "stored_fields_memory_in_bytes" : 42200624,
      "term_vectors_memory" : "0b",
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory" : "7.2mb",
      "norms_memory_in_bytes" : 7621184,
      "points_memory" : "0b",
      "points_memory_in_bytes" : 0,
      "doc_values_memory" : "24.2mb",
      "doc_values_memory_in_bytes" : 25445644,
      "index_writer_memory" : "714.6mb",
      "index_writer_memory_in_bytes" : 749317452,
      "version_map_memory" : "54.7mb",
      "version_map_memory_in_bytes" : 57403003,
      "fixed_bit_set" : "9.8mb",
      "fixed_bit_set_memory_in_bytes" : 10286424,
      "max_unsafe_auto_id_timestamp" : 1630943836323,
      "file_sizes" : { }
    },
    "mappings" : {
      "field_types" : [
        {
          "name" : "binary",
          "count" : 15,
          "index_count" : 4
        },
        {
          "name" : "boolean",
          "count" : 3050,
          "index_count" : 434
        },
        {
          "name" : "byte",
          "count" : 13,
          "index_count" : 13
        },
        {
          "name" : "date",
          "count" : 3687,
          "index_count" : 718
        },
        {
          "name" : "date_nanos",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "date_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "double",
          "count" : 8,
          "index_count" : 8
        },
        {
          "name" : "double_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "flattened",
          "count" : 9,
          "index_count" : 1
        },
        {
          "name" : "float",
          "count" : 4459,
          "index_count" : 162
        },
        {
          "name" : "float_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "geo_point",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "geo_shape",
          "count" : 3,
          "index_count" : 3
        },
        {
          "name" : "half_float",
          "count" : 86,
          "index_count" : 24
        },
        {
          "name" : "integer",
          "count" : 1100,
          "index_count" : 316
        },
        {
          "name" : "integer_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "ip",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "ip_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "keyword",
          "count" : 12719,
          "index_count" : 710
        },
        {
          "name" : "long",
          "count" : 9965,
          "index_count" : 626
        },
        {
          "name" : "long_range",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "nested",
          "count" : 120,
          "index_count" : 75
        },
        {
          "name" : "object",
          "count" : 6851,
          "index_count" : 601
        },
        {
          "name" : "scaled_float",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "shape",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "short",
          "count" : 46,
          "index_count" : 20
        },
        {
          "name" : "text",
          "count" : 12449,
          "index_count" : 692
        }
      ]
    },
    "analysis" : {
      "char_filter_types" : [ ],
      "tokenizer_types" : [ ],
      "filter_types" : [
        {
          "name" : "pattern_capture",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "analyzer_types" : [
        {
          "name" : "custom",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_char_filters" : [ ],
      "built_in_tokenizers" : [
        {
          "name" : "uax_url_email",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_filters" : [
        {
          "name" : "lowercase",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "unique",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_analyzers" : [
        {
          "name" : "whitespace",
          "count" : 1705,
          "index_count" : 349
        }
      ]
    }
  },
  "nodes" : {
    "count" : {
      "total" : 1,
      "coordinating_only" : 0,
      "data" : 1,
      "data_cold" : 1,
      "data_content" : 1,
      "data_hot" : 1,
      "data_warm" : 1,
      "ingest" : 1,
      "master" : 1,
      "ml" : 1,
      "remote_cluster_client" : 1,
      "transform" : 1,
      "voting_only" : 0
    },
    "versions" : [
      "7.10.1"
    ],
    "os" : {
      "available_processors" : 16,
      "allocated_processors" : 16,
      "names" : [
        {
          "name" : "Linux",
          "count" : 1
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "CentOS Linux 8 (Core)",
          "count" : 1
        }
      ],
      "mem" : {
        "total" : "31.4gb",
        "total_in_bytes" : 33728368640,
        "free" : "1gb",
        "free_in_bytes" : 1091543040,
        "used" : "30.3gb",
        "used_in_bytes" : 32636825600,
        "free_percent" : 3,
        "used_percent" : 97
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 4
      },
      "open_file_descriptors" : {
        "min" : 7401,
        "max" : 7401,
        "avg" : 7401
      }
    },
    "jvm" : {
      "max_uptime" : "98.6d",
      "max_uptime_in_millis" : 8525973712,
      "versions" : [
        {
          "version" : "15.0.1",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "15.0.1+9",
          "vm_vendor" : "AdoptOpenJDK",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 1
        }
      ],
      "mem" : {
        "heap_used" : "6gb",
        "heap_used_in_bytes" : 6542830384,
        "heap_max" : "16gb",
        "heap_max_in_bytes" : 17179869184
      },
      "threads" : 170
    },
    "fs" : {
      "total" : "1006.9gb",
      "total_in_bytes" : 1081180868608,
      "free" : "263.5gb",
      "free_in_bytes" : 282967248896,
      "available" : "212.3gb",
      "available_in_bytes" : 227974893568
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "security4" : 1
      },
      "http_types" : {
        "security4" : 1
      }
    },
    "discovery_types" : {
      "single-node" : 1
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "docker",
        "count" : 1
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 3,
      "processor_stats" : {
        "gsub" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        },
        "rename" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        },
        "script" : {
          "count" : 62627,
          "failed" : 0,
          "current" : 0,
          "time" : "825ms",
          "time_in_millis" : 825
        },
        "set" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        }
      }
    }
  }
}

Ok, you should look at reducing your shard count and definitely upgrade :slight_smile:

Can you post part of your _cat/indices?v output? It'll help us suggest the best way to do that.

OK, here it is (and thanks a lot for the help again :slight_smile:)

health status index                                      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   *****-builds-000059                       NffkD6LmS8yUYovgKOZ2Bg   1   1    1668858            0      1.2gb          1.2gb
yellow open   *****-builds-000058                       8UK97_OeRKa17d0-g1LQyw   1   1     211013            0    161.4mb        161.4mb
yellow open   *****-builds-000062                       iFqAUoLoSOe1JJ1eU12WbQ   1   1     125022            0     79.8mb         79.8mb
yellow open   *****-builds-000061                       R2l4xT--SdSTA0Boa7DYYQ   1   1     264761            0    205.2mb        205.2mb
yellow open   *****-builds-000060                       UKdJCKU_T1y76czSjM80Kg   1   1     243234            0    185.5mb        185.5mb
yellow open   *****-builds-000066                       kMoEnJcAREqMvqfYeb5B2w   1   1          0            0       208b           208b
yellow open   *****-builds-000065                       Me10WZbbRASIBJjxbjbOtg   1   1          0            0       208b           208b
yellow open   *****-builds-000064                       PuGi8I4qQPmVDSntdQ15rw   1   1          0            0       208b           208b
yellow open   *****-builds-000063                       SG-yq2cUQgSwT21VJAA2kg   1   1          0            0       208b           208b
yellow open   *******************-000001                 8fVcwz2kREa4clFoV-1gQw   1   1    7211842      1006798    774.2mb        774.2mb
yellow open   *******************-000004                 n92nhAYgRl-_vLA6WQNBYw   1   1     620902            0     60.1mb         60.1mb
yellow open   *******************-000005                 NWS6P-32Q-GyYW9RLXWGIw   1   1     426093            0     40.9mb         40.9mb
yellow open   *******************-000002                 8bnrxrK_S9yaUXHHem4vmA   1   1     445220            0     43.5mb         43.5mb
yellow open   *******************-000003                 8cBvG90LTk24SDwBNO_NAQ   1   1     421129            0     41.6mb         41.6mb
yellow open   *******************-000008                 R1kE3v94Rwy9hStYJrCdtw   1   1     150626            0     23.5mb         23.5mb
yellow open   *******************-000006                 0wv25G4TQ2uMqK22hwIUuQ   1   1     465199            0     45.3mb         45.3mb
yellow open   *******************-000007                 jmfdlcKMQ_KfCuwrd2OL-Q   1   1     617892            0     60.2mb         60.2mb
yellow open   *******************-000009                 FWN4v6brS0-2vUCaUHHgmA   1   1    4458608            0    678.8mb        678.8mb
green  open   .reporting-2020.03.15                      hAEHm9xbQuOPBfSLhqyyWg   1   0         11            3    469.9kb        469.9kb
yellow open   *******************-000008                 pK8wMHhnQx-3Kcd8y5xXGw   1   1   13293443            0      1.9gb          1.9gb
yellow open   *******************-000007                 JeyMzn4ZT8-dVqpr6b8WCQ   1   1   10029951            0      1.4gb          1.4gb
yellow open   *******************-000006                 aCdv1u-VSoizKvaovgshKw   1   1    9728344            0      1.3gb          1.3gb
yellow open   *******************-000005                 JM9MtkSNS8ei52UbWuToIw   1   1   13087064            0      1.8gb          1.8gb
yellow open   *******************-000004                 8Cx96f8jS123EwrZvtqXQA   1   1   10472533            0      1.4gb          1.4gb
yellow open   *******************-000003                 TC69Tk3KTBONJ1Ep5vAkCg   1   1   10442301            0      1.5gb          1.5gb
yellow open   *******************-000002                 eH6GNwHbSLi6DHkD6WWSQg   1   1   11216771            0      1.6gb          1.6gb
green  open   .apm-agent-configuration                   q0DNkG72TrKE_9OzEOriVg   1   0          0            0       283b           283b
yellow open   metricbeat-7.10.1-2021.09.06-000008        qc0uKBI-Scmcy3o8dqPJog   1   1      15540            0     13.4mb         13.4mb


Thanks. Those are super small indices!

What does your ILM policy look like?
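
To put numbers on "super small", here is a rough sketch that parses a few rows of the _cat/indices?v output above. The rows are copied from the post; the helper names and the 1 GB threshold are arbitrary choices for illustration, not anything Elasticsearch defines.

```python
# A few rows copied from the _cat/indices?v output in the thread.
CAT_OUTPUT = """\
health status index                                      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   *****-builds-000059                       NffkD6LmS8yUYovgKOZ2Bg   1   1    1668858            0      1.2gb          1.2gb
yellow open   *****-builds-000066                       kMoEnJcAREqMvqfYeb5B2w   1   1          0            0       208b           208b
green  open   .reporting-2020.03.15                      hAEHm9xbQuOPBfSLhqyyWg   1   0         11            3    469.9kb        469.9kb
yellow open   metricbeat-7.10.1-2021.09.06-000008        qc0uKBI-Scmcy3o8dqPJog   1   1      15540            0     13.4mb         13.4mb
"""

UNITS = {"kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4, "b": 1}


def to_bytes(size: str) -> float:
    """Convert a human-readable size such as '13.4mb' to bytes."""
    # Two-letter suffixes come first in UNITS so "kb" is not read as "b".
    for suffix, factor in UNITS.items():
        if size.endswith(suffix):
            return float(size[: -len(suffix)]) * factor
    raise ValueError(f"unrecognised size: {size}")


def parse_cat_indices(text: str) -> list:
    """Turn the whitespace-aligned table into one dict per index."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = parse_cat_indices(CAT_OUTPUT)

# Each index contributes pri * (1 + rep) shards to the cluster-wide count.
total_shards = sum(int(r["pri"]) * (1 + int(r["rep"])) for r in rows)

# Indices that hold no documents at all still cost shards.
empty = [r["index"] for r in rows if int(r["docs.count"]) == 0]

# Arbitrary illustration threshold: primaries smaller than 1 GB.
small = [r["index"] for r in rows if to_bytes(r["pri.store.size"]) < 1024**3]

print(total_shards, empty, len(small))
```

Run against the full output, something like this would show how many of the shards belong to indices that are empty or far below a useful size.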

There are some others that are bigger, like

yellow open   *****************-000007                   YmIjhgjjRciN7XF4cWog6A   1   1    8095771            0      3.8gb          3.8gb
yellow open   *****************-000008                   axSFyDLTRbCAQyVBOUj9ng   1   1    8716290            0      4.2gb          4.2gb
yellow open   *****************-000005                   1-zbELv8RFeEMyS1Dl6VHA   1   1    6225668            0      2.9gb          2.9gb
yellow open   *****************-000006                   X0Z_qa2SRJi0s314Ay_bEA   1   1    7193246            0      3.4gb          3.4gb
yellow open   *****************-000009                   geuJ6BCiTw-rth3Xf153KA   1   1    7849664            0      3.7gb          3.7gb
yellow open   *****************-000003                   kglfYi0qSWKKtv-V4rGBcw   1   1    8277215            0        4gb            4gb
yellow open   *****************-000004                   N14UHSErTKSJkIU_ZPbLoQ   1   1    7868410            0      3.7gb          3.7gb
yellow open   *****************-000001                   a_0quyOkTwCOaywNQvCX8g   1   1    8143804            0      3.8gb          3.8gb
yellow open   *****************-000002                   gcNF0Y5kSM6WpgWpR-zGLA   1   1    6527464            0      3.1gb          3.1gb
yellow open   *****************-000010                   pj8L-M25Tb-0_HTX3bQm-Q   1   1    6656933            0      3.2gb          3.2gb
yellow open   *****************-000011                   jz5SbJpwTCiHkJM4XBmRjQ   1   1    7885603            0      3.8gb          3.8gb
yellow open   *****************-000014                   jNu3QEeiQOCGcPuprmBapw   1   1    7820316            0      3.7gb          3.7gb
yellow open   *****************-000015                   Gg_zeGVlTvG7CUNiO1qiDQ   1   1    6105980            0      2.9gb          2.9gb
yellow open   *****************-000012                   u5fB9sYKTd2tS4F3cUqSjg   1   1    6490136            0      3.1gb          3.1gb
yellow open   *****************-000013                   9VbUX2TtQM65C0rgYgC1HQ   1   1    6947566            0      3.3gb          3.3gb

Still not as large I suppose :slight_smile:

ILM policy is pretty much the same for all indices:

    "policy" : {
      "phases" : {
        "hot" : {
          "min_age" : "0ms",
          "actions" : {
            "rollover" : {
              "max_size" : "10gb",
              "max_age" : "7d"
            }
          }
        },
        "delete" : {
          "min_age" : "180d",
          "actions" : {
            "delete" : {
              "delete_searchable_snapshot" : true
            }
          }
        }
      }
    }
  }
}

You should definitely increase your max_size; we generally recommend shard sizes of 30 to 50GB.
You could also increase your max_age to 1-2 months or so, which gives each index more time to grow before it rolls over.
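
As a rough illustration of that advice, the sketch below builds a revised policy body and estimates how many shards one rollover series keeps around. The values 50gb and 30d are simply picked from the ranges mentioned above, and the estimate assumes rollover is triggered by age (as it appears to be for most of the indices shown) with 1 primary and 1 replica per index. The function names are hypothetical.

```python
import json
import math

# Hypothetical revision of the posted policy: only max_size and max_age change.
policy = {
    "policy": {
        "phases": {
            "hot": {
                "min_age": "0ms",
                "actions": {
                    "rollover": {"max_size": "50gb", "max_age": "30d"}
                },
            },
            "delete": {
                "min_age": "180d",
                "actions": {"delete": {"delete_searchable_snapshot": True}},
            },
        }
    }
}


def indices_kept(retention_days: int, rollover_days: int) -> int:
    """Rough count of indices alive at once for one rollover series.

    Rolled-over indices wait retention_days before deletion,
    plus one index that is currently being written to.
    """
    return math.ceil(retention_days / rollover_days) + 1


def shards_kept(retention_days: int, rollover_days: int,
                primaries: int = 1, replicas: int = 1) -> int:
    """Shards for one series, counting primaries and replicas."""
    return indices_kept(retention_days, rollover_days) * primaries * (1 + replicas)


# Body that could be sent with PUT _ilm/policy/<your-policy-name>.
print(json.dumps(policy, indent=2))

# Weekly rollover versus monthly rollover, both with 180d retention.
print(shards_kept(180, 7), shards_kept(180, 30))  # 54 14
```

Under these assumptions, moving from a 7-day to a 30-day rollover cuts each series from roughly 54 shards to roughly 14, before even touching the replica count.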