Error: No index created - reason => "Validation Failed: 1: this action would add [2] shards"

Good morning,

I have an ELK environment installed in version 8.1. Within this environment, I have configured several pipelines that collect daily CSV files, process them, and turn them into Elasticsearch indices so that we can later build dashboards with that information.

This configuration has been running for some months now, and every day the information collected from those files is passed to Logstash.

The problem is that, reviewing things after a week, we have seen that the indices are not being created and therefore we have no data in the dashboards.

Reviewing the logs, we have detected the following problem in the Logstash log:

Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"nextcloud-usage-2022.08.29", :routing=>nil}, {"date"=>"2022-08-29 09:07:56", "assigned_quota"=>26843545600, "event"=>{"original"=>"ajimmor_9371,2022-08-29 09:07:56,1970-01-01 00:00:00,26843545600,,,,,"}, "last_login_date"=>1970-01-01T00:00:00Z, "number_uploads"=>nil, "user-id"=>"ajimmor_9371", "host"=>{"name"=>"tfvs0756"}, "number_shares"=>nil, "number_files"=>nil, "used_quota"=>nil, "log"=>{"file"=>{"path"=>"/etc/logstash/conf.d/nextcloud-usage-reports-daily/usage-report_20220829100747.csv"}}, "@version"=>"1", "@timestamp"=>2022-08-29T08:07:56Z, "number_downloads"=>nil}], :response=>{"index"=>{"_index"=>"nextcloud-usage-2022.08.29", "_id"=>nil, "status"=>400, "error"=>{"type"=>"validation_exception", "reason"=>"Validation Failed: 1: this action would add [2] shards, but this cluster currently has [999]/[1000] maximum normal shards open;"}}}}

We do not have a multi-node cluster implemented; it is only a single node. What does it mean that I have almost the maximum number of open shards? How can it be configured correctly so that this problem does not occur again in 3-4 months?

Thank you very much in advance,

Take a look at ILM to handle this.

You can increase the shard limit, but you should generally look to reduce the number of shards to keep your cluster efficient.

Thank you very much for the reply.

Now I have some other questions that I don't know how to resolve, since I'm new to this architecture.

First, in our cluster (which only has one node) we want to keep all the indices and their data indefinitely, so that in 2024, for example, we can still query data from 2021. If I apply ILM, what it does is delete indices once they reach whatever age X I configure, correct? Also, I thought ILM was an option that does not come by default with ELK, since it asks me to request a trial license.

I have read in many places about the possibility of increasing the number of shards beyond 1000, which is the default limit that Elasticsearch has configured. My question is: how do you configure that value?

If I do not apply ILM, I understand that the best solution is to add nodes to the cluster as needed. Is this correct too?

Best regards.

ILM is free functionality; you don't need anything other than the license that gets automatically generated when you first start Elasticsearch.
It works on the age of the index itself, not on the data in it.
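
For illustration, a minimal sketch of an ILM policy that deletes indices based on the age of the index itself (the policy name and the 365d threshold are assumptions for the example, not a suggestion that you should delete your data):

PUT _ilm/policy/nextcloud-usage-policy
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "365d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

Here min_age is measured from index creation (or rollover), i.e. the age of the index, not the timestamps of the documents inside it.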

Total shards per node | Elasticsearch Guide [8.4] | Elastic is how you can change it. You can add nodes to reduce the per-node count, yes. But that might not be effective.
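
For reference, a minimal sketch of raising that limit via the cluster settings API; the value 2000 is only an example, not a recommendation:

PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 2000
  }
}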

What is the output from the _cluster/stats?pretty&human API?
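
For example, from the command line (assuming Elasticsearch is reachable on localhost:9200 without TLS; add https and credentials if security is enabled):

curl "http://localhost:9200/_cluster/stats?pretty&human"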

Having lots of small shards is inefficient, both from a performance and a resource usage perspective. The latest version of Elasticsearch has reduced the overhead per shard, so I would recommend you upgrade. If you are going to keep data around for a long time, I would also recommend changing how you index your data: switch to monthly indices instead of daily ones, so you end up with fewer, larger shards. If you are using rollover, change the maximum time period each index covers in the ILM policy. You can also increase the shard limit per node, but doing so without addressing the other points may cause stability issues down the line that may, at that point, be difficult to fix without losing data.
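
If the indices are created by the Logstash elasticsearch output with a daily date pattern (which an index name like nextcloud-usage-2022.08.29 suggests), a rough sketch of the monthly change would be the following; the host and the rest of the output settings here are placeholders, not your actual pipeline:

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    # before: index => "nextcloud-usage-%{+YYYY.MM.dd}"   # one index (and shard) per day
    index => "nextcloud-usage-%{+YYYY.MM}"                 # one index per month
  }
}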

Thanks again.

Here is the output of the API request:

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "74GuC05qS1maX3BEiKrmLg",
  "timestamp" : 1661847464931,
  "status" : "yellow",
  "indices" : {
    "count" : 510,
    "shards" : {
      "total" : 510,
      "primaries" : 510,
      "replication" : 0.0,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "primaries" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "replication" : {
          "min" : 0.0,
          "max" : 0.0,
          "avg" : 0.0
        }
      }
    },
    "docs" : {
      "count" : 55870389,
      "deleted" : 62145
    },
    "store" : {
      "size_in_bytes" : 25720014582,
      "total_data_set_size_in_bytes" : 25720014582,
      "reserved_in_bytes" : 0
    },
    "fielddata" : {
      "memory_size_in_bytes" : 0,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 7896,
      "total_count" : 2650679,
      "hit_count" : 84284,
      "miss_count" : 2566395,
      "cache_size" : 1,
      "cache_count" : 1940,
      "evictions" : 1939
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 3783,
      "memory_in_bytes" : 0,
      "terms_memory_in_bytes" : 0,
      "stored_fields_memory_in_bytes" : 0,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 0,
      "points_memory_in_bytes" : 0,
      "doc_values_memory_in_bytes" : 0,
      "index_writer_memory_in_bytes" : 211366,
      "version_map_memory_in_bytes" : 0,
      "fixed_bit_set_memory_in_bytes" : 8880,
      "max_unsafe_auto_id_timestamp" : 1655974990834,
      "file_sizes" : { }
    },
    "mappings" : {
      "field_types" : [
        {
          "name" : "boolean",
          "count" : 6,
          "index_count" : 5,
          "script_count" : 0
        },
        {
          "name" : "constant_keyword",
          "count" : 3,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "date",
          "count" : 586,
          "index_count" : 486,
          "script_count" : 0
        },
        {
          "name" : "float",
          "count" : 10,
          "index_count" : 4,
          "script_count" : 0
        },
        {
          "name" : "geo_point",
          "count" : 2,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "integer",
          "count" : 2,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "ip",
          "count" : 1,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "keyword",
          "count" : 14481,
          "index_count" : 486,
          "script_count" : 0
        },
        {
          "name" : "long",
          "count" : 1000,
          "index_count" : 484,
          "script_count" : 0
        },
        {
          "name" : "nested",
          "count" : 3,
          "index_count" : 3,
          "script_count" : 0
        },
        {
          "name" : "object",
          "count" : 3178,
          "index_count" : 485,
          "script_count" : 0
        },
        {
          "name" : "text",
          "count" : 14300,
          "index_count" : 485,
          "script_count" : 0
        },
        {
          "name" : "version",
          "count" : 3,
          "index_count" : 3,
          "script_count" : 0
        }
      ],
      "runtime_field_types" : [ ]
    },
    "analysis" : {
      "char_filter_types" : [ ],
      "tokenizer_types" : [ ],
      "filter_types" : [ ],
      "analyzer_types" : [ ],
      "built_in_char_filters" : [ ],
      "built_in_tokenizers" : [ ],
      "built_in_filters" : [ ],
      "built_in_analyzers" : [ ]
    },
    "versions" : [
      {
        "version" : "8.1.0",
        "index_count" : 4,
        "primary_shard_count" : 4,
        "total_primary_bytes" : 11403028
      },
      {
        "version" : "8.1.1",
        "index_count" : 186,
        "primary_shard_count" : 186,
        "total_primary_bytes" : 6959446123
      },
      {
        "version" : "8.1.2",
        "index_count" : 320,
        "primary_shard_count" : 320,
        "total_primary_bytes" : 18749165431
      }
    ]
  },
  "nodes" : {
    "count" : {
      "total" : 1,
      "coordinating_only" : 0,
      "data" : 1,
      "data_cold" : 1,
      "data_content" : 1,
      "data_frozen" : 1,
      "data_hot" : 1,
      "data_warm" : 1,
      "ingest" : 1,
      "master" : 1,
      "ml" : 1,
      "remote_cluster_client" : 1,
      "transform" : 1,
      "voting_only" : 0
    },
    "versions" : [
      "8.1.2"
    ],
    "os" : {
      "available_processors" : 8,
      "allocated_processors" : 8,
      "names" : [
        {
          "name" : "Linux",
          "count" : 1
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "Ubuntu 20.04.4 LTS",
          "count" : 1
        }
      ],
      "architectures" : [
        {
          "arch" : "amd64",
          "count" : 1
        }
      ],
      "mem" : {
        "total_in_bytes" : 33670172672,
        "adjusted_total_in_bytes" : 33670172672,
        "free_in_bytes" : 5439082496,
        "used_in_bytes" : 28231090176,
        "free_percent" : 16,
        "used_percent" : 84
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 0
      },
      "open_file_descriptors" : {
        "min" : 2956,
        "max" : 2956,
        "avg" : 2956
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 71274818,
      "versions" : [
        {
          "version" : "17.0.2",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "17.0.2+8",
          "vm_vendor" : "Eclipse Adoptium",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 1
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 1545081296,
        "heap_max_in_bytes" : 16835936256
      },
      "threads" : 92
    },
    "fs" : {
      "total_in_bytes" : 1073214390272,
      "free_in_bytes" : 1047329243136,
      "available_in_bytes" : 1047329243136
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "netty4" : 1
      },
      "http_types" : {
        "netty4" : 1
      }
    },
    "discovery_types" : {
      "multi-node" : 1
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "deb",
        "count" : 1
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 0,
      "processor_stats" : { }
    },
    "indexing_pressure" : {
      "memory" : {
        "current" : {
          "combined_coordinating_and_primary_in_bytes" : 0,
          "coordinating_in_bytes" : 0,
          "primary_in_bytes" : 0,
          "replica_in_bytes" : 0,
          "all_in_bytes" : 0
        },
        "total" : {
          "combined_coordinating_and_primary_in_bytes" : 0,
          "coordinating_in_bytes" : 0,
          "primary_in_bytes" : 0,
          "replica_in_bytes" : 0,
          "all_in_bytes" : 0,
          "coordinating_rejections" : 0,
          "primary_rejections" : 0,
          "replica_rejections" : 0
        },
        "limit_in_bytes" : 0
      }
    }
  }
}

When I access ILM, I see the following notification:

Thanks again.

Thanks for the reply.

Changing from daily indices to monthly indices is not a solution for us: I have a process that generates daily CSV files that have to be processed so that we can later see the graphs in the dashboards we have implemented, and the client accesses them every day to see how the data has changed and to build reports based on the graphs.

If the indices and shards are going to stay small and continue to grow in number over time, you are IMHO at some point going to end up in trouble.

I do not understand why this requires daily indices; dashboards generally query an index pattern (e.g. nextcloud-usage-*) rather than individual daily indices, so daily CSV files can still be written into monthly indices. I would still recommend changing the process/dashboards so they work with non-daily indices.
