Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2000]/[2000] maximum shards open

I'm using Elastic 7.6.2
I'm creating an index using a pipeline: Filebeat reads a text file and sends the events to the pipeline.
I'm getting this error : Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2000]/[2000] maximum shards open

Previously it was working fine: I had created an index from the pipeline, fed my test data via Filebeat, and successfully sent those events through Logstash. Since it was all test data, I deleted those indices. Now, when I try to import live data the same way, I get this cluster error and cannot create the index.

Can someone please help!

Also, for information: I don't have admin rights to increase the shard limit on the cluster.

Welcome to our community! :smiley:
Please don't post pictures of text; they are difficult to read, impossible to search and replicate (if it's code), and some people may not even be able to see them :slight_smile:

What is the output from the _cluster/stats API endpoint?
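For example, from Kibana Dev Tools:

GET _cluster/stats?pretty

or with curl from the command line (localhost:9200 is just a placeholder; point it at one of your nodes):

curl -X GET "http://localhost:9200/_cluster/stats?pretty"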

Thank you.
Yes, I'll take care not to post pictures of text. That picture was of the Logstash error logs; I thought it would help with understanding. :slight_smile:

Here is the text version of the error from the image above.

[2020-11-30T15:26:07,238][WARN ][logstash.outputs.elasticsearch][elastic_08] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"mtbc-logs-test1", :routing=>nil, :_type=>"_doc"}, #<LogStash::Event:0xa5cf853>], :response=>{"index"=>{"_index"=>"mtbc-logs-test1", "_type"=>"_doc", "_id"=>nil, "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [2000]/[2000] maximum shards open;"}}}}

The output of _cluster/stats is:

{
  "_nodes" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "cluster_name" : "****",
  "cluster_uuid" : "*****",
  "timestamp" : 1606896583104,
  "status" : "green",
  "indices" : {
    "count" : 1004,
    "shards" : {
      "total" : 2008,
      "primaries" : 1004,
      "replication" : 1.0,
      "index" : {
        "shards" : {
          "min" : 2,
          "max" : 2,
          "avg" : 2.0
        },
        "primaries" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "replication" : {
          "min" : 1.0,
          "max" : 1.0,
          "avg" : 1.0
        }
      }
    },
    "docs" : {
      "count" : 3569786785,
      "deleted" : 2020705
    },
    "store" : {
      "size_in_bytes" : 1824465174551
    },
    "fielddata" : {
      "memory_size_in_bytes" : 5704156256,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 968114487,
      "total_count" : 32361946,
      "hit_count" : 46660,
      "miss_count" : 32315286,
      "cache_size" : 5573,
      "cache_count" : 7708,
      "evictions" : 2135
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 16050,
      "memory_in_bytes" : 2634937640,
      "terms_memory_in_bytes" : 1959526868,
      "stored_fields_memory_in_bytes" : 601939968,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 14103296,
      "points_memory_in_bytes" : 0,
      "doc_values_memory_in_bytes" : 59367508,
      "index_writer_memory_in_bytes" : 274806332,
      "version_map_memory_in_bytes" : 100662769,
      "fixed_bit_set_memory_in_bytes" : 3950104,
      "max_unsafe_auto_id_timestamp" : 1606742163681,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 5,
      "coordinating_only" : 0,
      "data" : 2,
      "ingest" : 2,
      "master" : 3,
      "ml" : 0,
      "voting_only" : 1
    },
    "versions" : [
      "7.6.2"
    ],
    "os" : {
      "available_processors" : 80,
      "allocated_processors" : 22,
      "names" : [
        {
          "name" : "Linux",
          "count" : 5
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "CentOS Linux 7 (Core)",
          "count" : 5
        }
      ],
      "mem" : {
        "total_in_bytes" : 675833057280,
        "free_in_bytes" : 52869578752,
        "used_in_bytes" : 622963478528,
        "free_percent" : 8,
        "used_percent" : 92
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 3
      },
      "open_file_descriptors" : {
        "min" : 386,
        "max" : 14222,
        "avg" : 5682
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 11627073237,
      "versions" : [
        {
          "version" : "13.0.2",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "13.0.2+8",
          "vm_vendor" : "AdoptOpenJDK",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 5
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 24404211656,
        "heap_max_in_bytes" : 35678060544
      },
      "threads" : 760
    },
    "fs" : {
      "total_in_bytes" : 2203318222848,
      "free_in_bytes" : 373114109952,
      "available_in_bytes" : 373114109952
    },
    "plugins" : [
      {
        "name" : "repository-s3",
        "version" : "7.6.2",
        "elasticsearch_version" : "7.6.2",
        "java_version" : "1.8",
        "description" : "The S3 repository plugin adds S3 repositories",
        "classname" : "org.elasticsearch.repositories.s3.S3RepositoryPlugin",
        "extended_plugins" : [ ],
        "has_native_controller" : false
      }
    ],
    "network_types" : {
      "transport_types" : {
        "security4" : 5
      },
      "http_types" : {
        "security4" : 5
      }
    },
    "discovery_types" : {
      "zen" : 5
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "tar",
        "count" : 5
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 5,
      "processor_stats" : {
        "geoip" : {
          "count" : 39904386,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 75782
        },
        "gsub" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        },
        "pipeline" : {
          "count" : 79808772,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 93480
        },
        "script" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        },
        "user_agent" : {
          "count" : 39904386,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 11270
        }
      }
    }
  }
}

Thanks.
The TL;DR is that you probably have too many shards. The [2000] cap is cluster.max_shards_per_node (default 1000) multiplied by your 2 data nodes, and it looks like your average shard size is only around 1.7 GB, which is pretty small.

You should look at the _shrink API, and perhaps alter your index sharding strategy.
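For illustration, here is a minimal sketch of what a shrink looks like. The index and node names (source-index, target-index, data-node-1) are made up for the example, and it assumes the source index has more than one primary shard. The source index must first be made read-only with a copy of every shard on a single node:

# Step 1: block writes and collect a copy of every shard on one node (hypothetical node name)
PUT /source-index/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "data-node-1",
    "index.blocks.write": true
  }
}

# Step 2: shrink into a new index with a single primary shard
POST /source-index/_shrink/target-index
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}

Once target-index is green and its document count matches, the source index can be deleted to free its shards.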

Thanks for the reply.
Instead of using the _shrink API, can we raise the cluster's shard limit using the code below?
PUT /_cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": "3000"
  }
}

Technically you can override this limit, but please be aware that it is there for a very good reason. Having lots of small shards is very inefficient, and I have seen numerous clusters where a high shard count has caused both performance and stability problems. These problems do not necessarily manifest immediately but tend to degrade the cluster gradually over time. At some point you may find yourself in a situation where the cluster is no longer operable and you could lose data.

By then it is often quite painful to change your sharding strategy and reindex data into larger indices with fewer shards, so I would recommend you start doing that now instead of applying this band-aid and kicking the can down the road.
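As an illustration only, reindexing a few small time-based indices into one larger index could look something like this (all index names here are hypothetical, not taken from your cluster):

# Hypothetical example: merge two small daily indices into one monthly index
POST /_reindex
{
  "source": {
    "index": ["mtbc-logs-test1-2020.11.29", "mtbc-logs-test1-2020.11.30"]
  },
  "dest": {
    "index": "mtbc-logs-2020.11"
  }
}

After verifying the document counts in the destination index, the small source indices can be deleted, which releases their shards and brings the cluster back under the limit.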
