Elasticsearch status: red

Hi team and all,

Newbie to ELK, evaluating it to go live soon. It was all working until it stopped... no changes made, I was just adding servers in one by one, shipping Winlogbeat directly to Elasticsearch.

Can someone tell me what I need to do to resolve this in simple language please?

Bitnami ELK Stack 5.6.3 on Windows Server 2012 R2

C:\Users\administrator>curl -XGET http://localhost:9200/_cluster/health?pretty=true
{
  "cluster_name" : "elasticsearch",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 1594,
  "active_shards" : 1594,
  "relocating_shards" : 0,
  "initializing_shards" : 4,
  "unassigned_shards" : 1943,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 206,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 742331,
  "active_shards_percent_as_number" : 45.01553233549845
}

Hi @minidan,

Run GET _cluster/allocation/explain?pretty to see why the shards are not allocated, from then on, the resolution will become easier. :wink:

Some primary shards are unassigned, but 4 shards are currently being initialized.

Thank you, output below. Not being a DB or programming guy, this stuff is quite alien to me....

C:\Users\administrator>curl -XGET http://localhost:9200/_cluster/allocation/explain?pretty
{
  "index" : "winlogbeat-2017.11.15",
  "shard" : 3,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "at" : "2017-12-01T13:25:05.325Z",
    "last_allocation_status" : "throttled"
  },
  "can_allocate" : "throttled",
  "allocate_explanation" : "allocation temporarily throttled",
  "node_allocation_decisions" : [
    {
      "node_id" : "ymeD-hFgRkilIoUS3xSgBA",
      "node_name" : "ymeD-hF",
      "transport_address" : "",
      "node_decision" : "throttled",
      "store" : {
        "in_sync" : true,
        "allocation_id" : "WkEMU..."
      },
      "deciders" : [
        {
          "decider" : "throttling",
          "decision" : "THROTTLE",
          "explanation" : "reached the limit of ongoing initial primary recoveries [4], cluster setting [cluster.routing.allocation.node_initial_primaries_recoveries=4]"
        }
      ]
    }
  ]
}
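For what it's worth, that explain output says allocation is throttled rather than failed: the node only recovers 4 primary shards at a time (the `cluster.routing.allocation.node_initial_primaries_recoveries` default), so with nearly 2,000 shards waiting it simply takes a very long time. A hedged sketch of temporarily raising the limit via the cluster settings API (the value 8 is an arbitrary example; a higher limit trades recovery time for disk and CPU load on a small machine):

```shell
curl -XPUT 'localhost:9200/_cluster/settings?pretty' -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.node_initial_primaries_recoveries": 8
  }
}'
```

Being a transient setting, it resets on the next node restart.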

Oh, you have too many shards for a single node. I recommend adding another node to the cluster so the shards can be redistributed.

Apologies for my total ignorance, but by "node" do you mean I need to spread this load across another server now?

I sense I have a lot more learning to do......

You probably don’t need 5 shards per index.
So change the index template and use 1 shard only.

Also, maybe you don't need to keep all the data around, but just the last 7 days?

In that case, remove old indices. You can use Elasticsearch Curator for that.
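For illustration, with daily indices named like the winlogbeat-2017.11.15 seen earlier in this thread, deleting one old index is a single call; Curator just automates the same thing on a schedule:

```shell
# Deletes one daily index; this is irreversible, so double-check the name first
curl -XDELETE 'localhost:9200/winlogbeat-2017.11.15?pretty'
```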

As already stated, you need to reduce the shard count. This is a reasonably common mistake, which is why I created a blog post containing guidance and best practices around sharding.

Hi both, unfortunately this project is to provide a central logging server with a full historical record, so I cannot trim the data down this way.

I will give the "What is a shard?" page a read immediately, as I have lots to learn. Just to mention, the current configuration is absolutely "out of the box", so I'm just hoping to ascertain the simplest, most reliable way forward really :slight_smile:

You can reduce the shard count by using the shrink index API to go from 5 to 1 shard per index. Given the number of indices you have and the fact that you want to keep data for a very long period, you probably need to reindex into monthly indices and/or scale out the cluster.
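As a sketch of the shrink flow for one daily index (the target name here is made up for the example): in 5.x the source index must first be made read-only, and all of its shards must sit on one node, which is already true on a single-node cluster:

```shell
# 1. Block writes on the source index so it can be shrunk
curl -XPUT 'localhost:9200/winlogbeat-2017.11.15/_settings?pretty' -H 'Content-Type: application/json' -d'
{ "index.blocks.write": true }'

# 2. Shrink the 5-shard index into a new single-shard index
curl -XPOST 'localhost:9200/winlogbeat-2017.11.15/_shrink/winlogbeat-2017.11.15-single?pretty' -H 'Content-Type: application/json' -d'
{ "settings": { "index.number_of_shards": 1 } }'
```

Once the shrunk index is green, the original can be deleted; repeating this per index is easily scripted.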

1 Like

Okay, I'll have to come back to this later... I'm not even sure where the "5" shards are being referenced now that I'm looking back at my outputs? :face_with_raised_eyebrow:

Scaling out sounds like it would require less maintenance going forward (this needs to be as close to "set and forget" as possible; we're a very small business with no staff to "maintain" the DB going forward).


5 shards and 1 replica per index is the default set by Elasticsearch, and I assume you have not changed it. With 5 primary shards and 1 replica of each, an index has 10 shards in total. So each of your daily indices adds 10 shards to be stored within your node.
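Those defaults line up with the health output earlier in the thread; adding the active, initializing, and unassigned counts gives the total number of shards the single node is trying to hold:

```shell
# Shard counts taken from the _cluster/health output above
ACTIVE=1594
INITIALIZING=4
UNASSIGNED=1943
TOTAL=$((ACTIVE + INITIALIZING + UNASSIGNED))
PER_INDEX=$((5 * (1 + 1)))   # 5 primaries, each with 1 replica
echo "$TOTAL shards overall, about $((TOTAL / PER_INDEX)) daily indices"
# -> 3541 shards overall, about 354 daily indices
```

That's roughly a year of daily winlogbeat-* indices on one node, which is why allocation is so far behind.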

Leaving it as it is will just make it even more difficult to fix the problems later on. You can set the number of shards to be used through an index template. The following will set it to 1 for all newly created indices (assuming you do not have any other index template that overrides it):

curl -XPUT 'localhost:9200/_template/single_primary_shard?pretty' -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["*"],
  "settings": {
    "number_of_shards": 1
  }
}'
If you are using Logstash or Beats to ingest data, I would also recommend changing to weekly or monthly indices (depending on data volume). These measures will slow down shard growth in the cluster.
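Since Winlogbeat here ships straight to Elasticsearch, the index name is set in winlogbeat.yml; a sketch of switching from the daily default to monthly indices (the hosts value is an assumption for this setup):

```yaml
# winlogbeat.yml (sketch) -- the default index pattern is "winlogbeat-%{+yyyy.MM.dd}"
output.elasticsearch:
  hosts: ["localhost:9200"]
  index: "winlogbeat-%{+yyyy.MM}"   # one index per month instead of per day
```

The existing winlogbeat-* index template still matches the monthly names, so the template side should not need changes.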

Hi again, I keep getting pulled away from this... it's hard to run anything with those variables while the whole thing is stuck. I presume it needs subtle reformatting for curl under Windows.

Really I'd just like to be able to delete any accrued data and indexes to date and start again at this point.

Hooray! I have a working install again after some cleaning up as above... still struggling a bit with something, though.

Right now, I only have the winlogbeat-* template installed, and have verified it can be used (200 OK).

I get the following output every time, and it's not clear how to use the command above; I have tried many variations. Going to investigate support contact options here at some point, I think.

"error": {
"root_cause": [
"type": "action_request_validation_exception",
"reason": "Validation Failed: 1: template is missing;"
"type": "action_request_validation_exception",
"reason": "Validation Failed: 1: template is missing;"
"status": 400

Is there a more general resource around here for index templates then?

I've got what I'm supposed to be doing now, I think, with GET / PUT from my original template, but I'm struggling to insert the section above into my existing template without a syntax error.... My server is still up and running 3 days in, but I'd like to ensure robustness going forward.

Also, I'm not really sure how to change to weekly or monthly indices as mentioned. Again, I'm really not coming from a dev/programmer background, so the learning curve is quite steep on the assumed knowledge.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.