Error: No index created - reason => "Validation Failed: 1: this action would add [2] shards"

Good morning,

I have an ELK environment installed in version 8.1. Within this environment, I have configured several pipelines that collect daily CSV files, process them, and turn them into Elasticsearch indices so that we can later build dashboards with that information.

This configuration has been running for some months now, and every day the information collected from those files is passed to Logstash.

The problem is that, reviewing things after a week, we have seen that the indices are not being created and therefore we have no data in the dashboards.

Reviewing the logs, we have detected the following problem in the Logstash log:

Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"nextcloud-usage-2022.08.29", :routing=>nil}, {"date"=>"2022-08-29 09:07:56", "assigned_quota"=>26843545600, "event"=>{"original"=>"ajimmor_9371,2022-08-29 09:07:56,1970-01-01 00:00:00,26843545600,,,,,"}, "last_login_date"=>1970-01-01T00:00:00Z, "number_uploads"=>nil, "user-id"=>"ajimmor_9371", "host"=>{"name"=>"tfvs0756"}, "number_shares"=>nil, "number_files"=>nil, "used_quota"=>nil, "log"=>{"file"=>{"path"=>"/etc/logstash/conf.d/nextcloud-usage-reports-daily/usage-report_20220829100747.csv"}}, "@version"=>"1", "@timestamp"=>2022-08-29T08:07:56Z, "number_downloads"=>nil}], :response=>{"index"=>{"_index"=>"nextcloud-usage-2022.08.29", "_id"=>nil, "status"=>400, "error"=>{"type"=>"validation_exception", "reason"=>"Validation Failed: 1: this action would add [2] shards, but this cluster currently has [999]/[1000] maximum normal shards open;"}}}}

We do not have a multi-node cluster implemented; it is only a single node. What does it mean that I have almost the maximum number of open shards? How can it be configured correctly so that this problem does not occur again in 3-4 months?

Thank you very much in advance,

Take a look at ILM to handle this.

You can increase the shard limit, but you should generally look to reduce the number of shards to keep your cluster efficient.

Thank you very much for the reply.

Now I have some other questions that I don't know how to resolve, since I'm new to this architecture.

First, in our cluster (which only has one node) we want to keep all the indices and their data indefinitely, so that in 2024, for example, we can still query data from 2021. If I apply ILM, what it does is delete indices once they reach whatever age X I configure, correct? Also, I thought ILM was an option that does not come by default with ELK, since it asks me to request a trial license.

I have read in many places about the possibility of increasing the number of shards beyond 1000, which is the default limit that Elasticsearch has configured. My question is: how do you configure that value?

If I do not apply ILM, I understand that the best solution is to add nodes to the cluster as needed. Is this correct too?

Best regards.

ILM is free functionality; you don't need anything other than the license that gets automatically generated when you first start Elasticsearch.
It works on the age of the index itself, not on the data in it.
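
For illustration, a minimal sketch of an ILM policy that deletes indices based on the age of the index itself (the policy name and the 365d threshold are assumptions for the example, not a suggestion that you should delete your data):

PUT _ilm/policy/nextcloud-usage-policy
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "365d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

Here min_age is measured from index creation (or rollover), i.e. the age of the index, not the timestamps of the documents inside it.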

Total shards per node | Elasticsearch Guide [8.4] | Elastic is how you can change it. You can add nodes to reduce the per-node count, yes. But that might not be effective.
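
For reference, a minimal sketch of raising that limit via the cluster settings API; the value 2000 is only an example, not a recommendation:

PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 2000
  }
}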

What is the output from the _cluster/stats?pretty&human API?
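
For example, from the command line (assuming Elasticsearch is reachable on localhost:9200 without TLS; add https and credentials if security is enabled):

curl "http://localhost:9200/_cluster/stats?pretty&human"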

Having lots of small shards is inefficient, both from a performance and a resource usage perspective. The latest version of Elasticsearch has reduced the overhead per shard, so I would recommend you upgrade. If you are going to keep data around for a long time, I would also recommend changing how you index your data: switch to monthly indices instead of daily ones, so you end up with fewer, larger shards. If you are using rollover, change the maximum time period each index covers in the ILM policy. You can also increase the shard limit per node, but doing so without addressing the other points may cause stability issues down the line that may, at that point, be difficult to fix without losing data.
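
If the indices are created by the Logstash elasticsearch output with a daily date pattern (which an index name like nextcloud-usage-2022.08.29 suggests), a rough sketch of the monthly change would be the following; the host and the rest of the output settings here are placeholders, not your actual pipeline:

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    # before: index => "nextcloud-usage-%{+YYYY.MM.dd}"   # one index (and shard) per day
    index => "nextcloud-usage-%{+YYYY.MM}"                 # one index per month
  }
}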

Thanks again.

Here is the output of the API request:

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "74GuC05qS1maX3BEiKrmLg",
  "timestamp" : 1661847464931,
  "status" : "yellow",
  "indices" : {
    "count" : 510,
    "shards" : {
      "total" : 510,
      "primaries" : 510,
      "replication" : 0.0,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "primaries" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "replication" : {
          "min" : 0.0,
          "max" : 0.0,
          "avg" : 0.0
        }
      }
    },
    "docs" : {
      "count" : 55870389,
      "deleted" : 62145
    },
    "store" : {
      "size_in_bytes" : 25720014582,
      "total_data_set_size_in_bytes" : 25720014582,
      "reserved_in_bytes" : 0
    },
    "fielddata" : {
      "memory_size_in_bytes" : 0,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 7896,
      "total_count" : 2650679,
      "hit_count" : 84284,
      "miss_count" : 2566395,
      "cache_size" : 1,
      "cache_count" : 1940,
      "evictions" : 1939
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 3783,
      "memory_in_bytes" : 0,
      "terms_memory_in_bytes" : 0,
      "stored_fields_memory_in_bytes" : 0,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 0,
      "points_memory_in_bytes" : 0,
      "doc_values_memory_in_bytes" : 0,
      "index_writer_memory_in_bytes" : 211366,
      "version_map_memory_in_bytes" : 0,
      "fixed_bit_set_memory_in_bytes" : 8880,
      "max_unsafe_auto_id_timestamp" : 1655974990834,
      "file_sizes" : { }
    },
    "mappings" : {
      "field_types" : [
        {
          "name" : "boolean",
          "count" : 6,
          "index_count" : 5,
          "script_count" : 0
        },
        {
          "name" : "constant_keyword",
          "count" : 3,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "date",
          "count" : 586,
          "index_count" : 486,
          "script_count" : 0
        },
        {
          "name" : "float",
          "count" : 10,
          "index_count" : 4,
          "script_count" : 0
        },
        {
          "name" : "geo_point",
          "count" : 2,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "integer",
          "count" : 2,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "ip",
          "count" : 1,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "keyword",
          "count" : 14481,
          "index_count" : 486,
          "script_count" : 0
        },
        {
          "name" : "long",
          "count" : 1000,
          "index_count" : 484,
          "script_count" : 0
        },
        {
          "name" : "nested",
          "count" : 3,
          "index_count" : 3,
          "script_count" : 0
        },
        {
          "name" : "object",
          "count" : 3178,
          "index_count" : 485,
          "script_count" : 0
        },
        {
          "name" : "text",
          "count" : 14300,
          "index_count" : 485,
          "script_count" : 0
        },
        {
          "name" : "version",
          "count" : 3,
          "index_count" : 3,
          "script_count" : 0
        }
      ],
      "runtime_field_types" : [ ]
    },
    "analysis" : {
      "char_filter_types" : [ ],
      "tokenizer_types" : [ ],
      "filter_types" : [ ],
      "analyzer_types" : [ ],
      "built_in_char_filters" : [ ],
      "built_in_tokenizers" : [ ],
      "built_in_filters" : [ ],
      "built_in_analyzers" : [ ]
    },
    "versions" : [
      {
        "version" : "8.1.0",
        "index_count" : 4,
        "primary_shard_count" : 4,
        "total_primary_bytes" : 11403028
      },
      {
        "version" : "8.1.1",
        "index_count" : 186,
        "primary_shard_count" : 186,
        "total_primary_bytes" : 6959446123
      },
      {
        "version" : "8.1.2",
        "index_count" : 320,
        "primary_shard_count" : 320,
        "total_primary_bytes" : 18749165431
      }
    ]
  },
  "nodes" : {
    "count" : {
      "total" : 1,
      "coordinating_only" : 0,
      "data" : 1,
      "data_cold" : 1,
      "data_content" : 1,
      "data_frozen" : 1,
      "data_hot" : 1,
      "data_warm" : 1,
      "ingest" : 1,
      "master" : 1,
      "ml" : 1,
      "remote_cluster_client" : 1,
      "transform" : 1,
      "voting_only" : 0
    },
    "versions" : [
      "8.1.2"
    ],
    "os" : {
      "available_processors" : 8,
      "allocated_processors" : 8,
      "names" : [
        {
          "name" : "Linux",
          "count" : 1
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "Ubuntu 20.04.4 LTS",
          "count" : 1
        }
      ],
      "architectures" : [
        {
          "arch" : "amd64",
          "count" : 1
        }
      ],
      "mem" : {
        "total_in_bytes" : 33670172672,
        "adjusted_total_in_bytes" : 33670172672,
        "free_in_bytes" : 5439082496,
        "used_in_bytes" : 28231090176,
        "free_percent" : 16,
        "used_percent" : 84
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 0
      },
      "open_file_descriptors" : {
        "min" : 2956,
        "max" : 2956,
        "avg" : 2956
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 71274818,
      "versions" : [
        {
          "version" : "17.0.2",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "17.0.2+8",
          "vm_vendor" : "Eclipse Adoptium",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 1
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 1545081296,
        "heap_max_in_bytes" : 16835936256
      },
      "threads" : 92
    },
    "fs" : {
      "total_in_bytes" : 1073214390272,
      "free_in_bytes" : 1047329243136,
      "available_in_bytes" : 1047329243136
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "netty4" : 1
      },
      "http_types" : {
        "netty4" : 1
      }
    },
    "discovery_types" : {
      "multi-node" : 1
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "deb",
        "count" : 1
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 0,
      "processor_stats" : { }
    },
    "indexing_pressure" : {
      "memory" : {
        "current" : {
          "combined_coordinating_and_primary_in_bytes" : 0,
          "coordinating_in_bytes" : 0,
          "primary_in_bytes" : 0,
          "replica_in_bytes" : 0,
          "all_in_bytes" : 0
        },
        "total" : {
          "combined_coordinating_and_primary_in_bytes" : 0,
          "coordinating_in_bytes" : 0,
          "primary_in_bytes" : 0,
          "replica_in_bytes" : 0,
          "all_in_bytes" : 0,
          "coordinating_rejections" : 0,
          "primary_rejections" : 0,
          "replica_rejections" : 0
        },
        "limit_in_bytes" : 0
      }
    }
  }
}

When I access ILM, I see the following notification:

Thanks again.

Thanks for the reply.

Changing from daily indices to monthly indices is not a solution for us: I have a process that generates daily CSV files that have to be processed so that we can later see the graphs in the dashboards we have implemented, and the client accesses them every day to see how the data has changed and to build reports based on the graphs.

If the indices and shards are going to stay small and continue to grow in number over time, you are IMHO at some point going to end up in trouble.

I do not understand why this requires daily indices; dashboards generally query an index pattern (e.g. nextcloud-usage-*) rather than individual daily indices, so daily CSV files can still be written into monthly indices. I would still recommend changing the process/dashboards so they work with non-daily indices.
