Elasticsearch crashed on development machine, trying to recover

We have a 3-node Elasticsearch/Kibana/Logstash setup with docker-compose; each node has 8 GB for the Java heap.
I tried working with some firewall rules to secure it a bit, but found out that firewalld (CentOS) messes up the internal Docker firewall rules.
I didn't find the problem until a bit later, and by that point the cluster was crashing.
So now I'm trying to start the 3 nodes again. I've stopped Logstash to give Elasticsearch a bit of peace to let things align again.
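
For reference, each Elasticsearch node is defined roughly like this in the docker-compose.yml (a sketch from memory; the service name and volume are illustrative, but the heap is set via ES_JAVA_OPTS):

  elasticsearch1:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.5.1
    environment:
      - node.name=es01
      - cluster.name=docker-cluster
      - ES_JAVA_OPTS=-Xms8g -Xmx8g   # 8 GB Java heap per node
    volumes:
      - esdata1:/usr/share/elasticsearch/data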

I'm seeing these messages in the logfile:
elasticsearch2 | [2018-12-12T13:14:07,288][WARN ][o.e.i.c.IndicesClusterStateService] [es02] [[filebeat-6.5.1-2016.05.14][4]] marking and sending shard failed due to [failed recovery]
elasticsearch2 | org.elasticsearch.indices.recovery.RecoveryFailedException: [filebeat-6.5.1-2016.05.14][4]: Recovery failed from {es03}{utNZZop8SRuCuX_KZffjuw}{6cOtisSfRYiB42LyNWUudQ}{172.18.0.5}{172.18.0.5:9300}{ml.machine_memory=37779542016, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {es02}{uEcPa_WSTF2hpMgqXIC1Ww}{LWBSx4qVQv2-6hJ7KrrItA}{172.18.0.4}{172.18.0.4:9300}{ml.machine_memory=37779542016, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}

I'm not sure if this message means it's still trying to recover, or if it's a real error.
I would hate to lose the database, but if that's the way to go, I'll have to, and then reload all the stored log files.

Is this message still being reported, or was it a one-off? Is your cluster healthy now that you have fixed the firewall? What does GET _cluster/health report? If it's all OK now then there's nothing to worry about; this sort of message is what you'd expect if nodes cannot communicate with each other for a while.

If your cluster is not currently healthy, and you need help recovering it, can you share the output of GET /_cluster/allocation/explain?
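
If you're not using the Kibana Dev Tools console, the equivalent curl commands would be something like this (adjust the host and port to wherever one of your nodes is reachable):

curl -s 'http://localhost:9200/_cluster/health?pretty'
curl -s 'http://localhost:9200/_cluster/allocation/explain?pretty'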


Hi David

It has quieted down. The cluster health is yellow, and it has stopped syncing at 99.98494089300505%.

It seems that there are four indices that have problems.
All the green ones show:
{"status":"green","number_of_shards":5,"number_of_replicas":1,"active_primary_shards":5,"active_shards":10,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0},"filebeat-6.5.1-2016.05.14":

While the four yellow ones show:
{"status":"yellow","number_of_shards":5,"number_of_replicas":1,"active_primary_shards":5,"active_shards":9,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":1},"filebeat-6.5.1-2016.05.13":

So each of these four indices is missing 1 active shard.
This also shows up in _cluster/health, which reports 4 unassigned shards.

This is the output from _cluster/allocation/explain:
{
"index": "filebeat-6.5.1-2017.09.20",
"shard": 3,
"primary": false,
"current_state": "unassigned",
"unassigned_info": {
"reason": "ALLOCATION_FAILED",
"at": "2018-12-13T02:46:15.515Z",
"failed_allocation_attempts": 5,
"details": "failed shard on node [uEcPa_WSTF2hpMgqXIC1Ww]: failed recovery, failure RecoveryFailedException[[filebeat-6.5.1-2017.09.20][3]: Recovery failed from {es03}{utNZZop8SRuCuX_KZffjuw}{lDLyWs5ZR1mJHQ1pgIOQVw}{172.20.0.5}{172.20.0.5:9300}{ml.machine_memory=37779542016, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {es02}{uEcPa_WSTF2hpMgqXIC1Ww}{BF-RbwnKRzecVVNE_s8cvw}{172.20.0.3}{172.20.0.3:9300}{ml.machine_memory=37779542016, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[es03][172.20.0.5:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[es02][172.20.0.3:9300][internal:index/shard/recovery/prepare_translog]]; nested: TranslogCorruptedException[translog from source [/usr/share/elasticsearch/data/nodes/0/indices/9AHmqT9SRECymSKzcGZH1w/3/translog/translog-21.tlog] is corrupted, expected shard UUID [35 53 30 78 55 35 35 70 51 69 71 66 47 6a 67 41 72 4a 36 59 5a 41] but got: [71 42 70 57 69 52 7a 6e 52 4a 75 69 49 70 75 67 4b 34 43 51 36 67] this translog file belongs to a different translog]; ",
"last_allocation_status": "no_attempt"
},
"can_allocate": "no",
"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions": [
{
"node_id": "RuY8hzATQLimTu4IQ0c--Q",
"node_name": "es01",
"transport_address": "172.20.0.4:9300",
"node_attributes": {
"ml.machine_memory": "37779542016",
"xpack.installed": "true",
"ml.max_open_jobs": "20",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "max_retry",
"decision": "NO",
"explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2018-12-13T02:46:15.515Z], failed_attempts[5], delayed=false, details[failed shard on node [uEcPa_WSTF2hpMgqXIC1Ww]: failed recovery, failure RecoveryFailedException[[filebeat-6.5.1-2017.09.20][3]: Recovery failed from {es03}{utNZZop8SRuCuX_KZffjuw}{lDLyWs5ZR1mJHQ1pgIOQVw}{172.20.0.5}{172.20.0.5:9300}{ml.machine_memory=37779542016, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {es02}{uEcPa_WSTF2hpMgqXIC1Ww}{BF-RbwnKRzecVVNE_s8cvw}{172.20.0.3}{172.20.0.3:9300}{ml.machine_memory=37779542016, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[es03][172.20.0.5:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[es02][172.20.0.3:9300][internal:index/shard/recovery/prepare_translog]]; nested: TranslogCorruptedException[translog from source [/usr/share/elasticsearch/data/nodes/0/indices/9AHmqT9SRECymSKzcGZH1w/3/translog/translog-21.tlog] is corrupted, expected shard UUID [35 53 30 78 55 35 35 70 51 69 71 66 47 6a 67 41 72 4a 36 59 5a 41] but got: [71 42 70 57 69 52 7a 6e 52 4a 75 69 49 70 75 67 4b 34 43 51 36 67] this translog file belongs to a different translog]; ], allocation_status[no_attempt]]]"
}
]
},
{
"node_id": "uEcPa_WSTF2hpMgqXIC1Ww",
"node_name": "es02",
"transport_address": "172.20.0.3:9300",
"node_attributes": {
"ml.machine_memory": "37779542016",
"ml.max_open_jobs": "20",
"xpack.installed": "true",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "max_retry",
"decision": "NO",
"explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2018-12-13T02:46:15.515Z], failed_attempts[5], delayed=false, details[failed shard on node [uEcPa_WSTF2hpMgqXIC1Ww]: failed recovery, failure RecoveryFailedException[[filebeat-6.5.1-2017.09.20][3]: Recovery failed from {es03}{utNZZop8SRuCuX_KZffjuw}{lDLyWs5ZR1mJHQ1pgIOQVw}{172.20.0.5}{172.20.0.5:9300}{ml.machine_memory=37779542016, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {es02}{uEcPa_WSTF2hpMgqXIC1Ww}{BF-RbwnKRzecVVNE_s8cvw}{172.20.0.3}{172.20.0.3:9300}{ml.machine_memory=37779542016, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[es03][172.20.0.5:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[es02][172.20.0.3:9300][internal:index/shard/recovery/prepare_translog]]; nested: TranslogCorruptedException[translog from source [/usr/share/elasticsearch/data/nodes/0/indices/9AHmqT9SRECymSKzcGZH1w/3/translog/translog-21.tlog] is corrupted, expected shard UUID [35 53 30 78 55 35 35 70 51 69 71 66 47 6a 67 41 72 4a 36 59 5a 41] but got: [71 42 70 57 69 52 7a 6e 52 4a 75 69 49 70 75 67 4b 34 43 51 36 67] this translog file belongs to a different translog]; ], allocation_status[no_attempt]]]"
}
]
},
{
"node_id": "utNZZop8SRuCuX_KZffjuw",
"node_name": "es03",
"transport_address": "172.20.0.5:9300",
"node_attributes": {
"ml.machine_memory": "37779542016",
"ml.max_open_jobs": "20",
"xpack.installed": "true",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [

    {
      "decider": "max_retry",
      "decision": "NO",
      "explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2018-12-13T02:46:15.515Z], failed_attempts[5], delayed=false, details[failed shard on node [uEcPa_WSTF2hpMgqXIC1Ww]: failed recovery, failure RecoveryFailedException[[filebeat-6.5.1-2017.09.20][3]: Recovery failed from {es03}{utNZZop8SRuCuX_KZffjuw}{lDLyWs5ZR1mJHQ1pgIOQVw}{172.20.0.5}{172.20.0.5:9300}{ml.machine_memory=37779542016, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {es02}{uEcPa_WSTF2hpMgqXIC1Ww}{BF-RbwnKRzecVVNE_s8cvw}{172.20.0.3}{172.20.0.3:9300}{ml.machine_memory=37779542016, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[es03][172.20.0.5:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[es02][172.20.0.3:9300][internal:index/shard/recovery/prepare_translog]]; nested: TranslogCorruptedException[translog from source [/usr/share/elasticsearch/data/nodes/0/indices/9AHmqT9SRECymSKzcGZH1w/3/translog/translog-21.tlog] is corrupted, expected shard UUID [35 53 30 78 55 35 35 70 51 69 71 66 47 6a 67 41 72 4a 36 59 5a 41] but got: [71 42 70 57 69 52 7a 6e 52 4a 75 69 49 70 75 67 4b 34 43 51 36 67] this translog file belongs to a different translog]; ], allocation_status[no_attempt]]]"
    },
    {
      "decider": "same_shard",
      "decision": "NO",
      "explanation": "the shard cannot be allocated to the same node on which a copy of the shard already exists [[filebeat-6.5.1-2017.09.20][3], node[utNZZop8SRuCuX_KZffjuw], [P], s[STARTED], a[id=TdtGlLWDQoCxYo5yM96dyw]]"
    }
  ]
}

]
}

I don't quite know how we ended up in this state, but the simplest fix would be to set number_of_replicas on each of the affected indices to 0, wait for the cluster to report its health as GREEN, and then set it back to 1 to rebuild those replicas.


I see, that makes perfect sense.
Being very new to the Elasticsearch world, could you explain to me how to do that? :wink:

Of course:

PUT /filebeat-6.5.1-2017.09.20/_settings
{"number_of_replicas":0}

(do the same for the other yellow indices, replacing filebeat-6.5.1-2017.09.20 with the index name). Your cluster should now report its health as green, according to:

GET /_cluster/health

Then restore the replicas with this:

PUT /filebeat-6.5.1-2017.09.20/_settings
{"number_of_replicas":1}

(again, do the same thing for the other indices)
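
As a side note, most index-level APIs accept a comma-separated list of index names (or a wildcard), so, assuming filebeat-6.5.1-2016.05.13 and filebeat-6.5.1-2017.09.20 are two of your yellow indices, you could update several in one call:

PUT /filebeat-6.5.1-2016.05.13,filebeat-6.5.1-2017.09.20/_settings
{"number_of_replicas":0}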


Hi David

Heh, while I was learning and you were explaining, some of it has resolved itself.
I now see this message:
{"index":"filebeat-6.5.1-2015.04.11","shard":1,"primary":false,"current_state":"unassigned","unassigned_info":{"reason":"ALLOCATION_FAILED","at":"2018-12-17T16:34:13.467Z","failed_allocation_attempts":5,"details":"failed shard on node [uEcPa_WSTF2hpMgqXIC1Ww]: failed recovery, failure RecoveryFailedException[[filebeat-6.5.1-2015.04.11][1]: Recovery failed from {es03}{utNZZop8SRuCuX_KZffjuw}{WkcmaW8vSiWMegcpeiTQmA}{172.22.0.2}{172.22.0.2:9300}{ml.machine_memory=37779542016, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {es02}{uEcPa_WSTF2hpMgqXIC1Ww}{Le9IxadgSraVPcqzh0Np_A}{172.22.0.3}{172.22.0.3:9300}{ml.machine_memory=37779542016, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[es03][172.22.0.2:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[es02][172.22.0.3:9300][internal:index/shard/recovery/prepare_translog]]; nested: NoSuchFileException[/usr/share/elasticsearch/data/nodes/0/indices/ThT-y0wiTraU-tyAq4ML4Q/1/translog/translog.ckp]; ","last_allocation_status":"no_attempt"},"can_allocate":"no","allocate_explanation":"cannot allocate because allocation is not permitted to any of the nodes","node_allocation_decisions":[{"node_id":"RuY8hzATQLimTu4IQ0c--Q","node_name":"es01","transport_address":"172.22.0.4:9300","node_attributes":{"ml.machine_memory":"37779542016","xpack.installed":"true","ml.max_open_jobs":"20","ml.enabled":"true"},"node_decision":"no","deciders":[{"decider":"max_retry","decision":"NO","explanation":"shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED],
...

So I did a
curl -XPOST http://elasticsearch.onead.dk:9200/_cluster/reroute?retry_failed=true
and now it has gone to green, and it shows:
{
"cluster_name" : "docker-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 13341,
"active_shards" : 26682,
"relocating_shards" : 2,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 3,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 4837,
"active_shards_percent_as_number" : 100.0,

So I think it is on its way to recovery.

That is a very, very high number of shards given the size of your cluster. Please read this blog post, which provides some practical guidelines around shard sizes and sharding. Then look to change your sharding strategy and bring down the number of shards significantly.
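
A quick way to see where all those shards come from is the cat indices API, which shows the primary and replica counts and the on-disk size per index, for example:

GET _cat/indices?v&h=index,pri,rep,docs.count,store.size&s=store.size:desc

With daily filebeat indices at 5 primaries and 1 replica each, that is 10 shards per beat per day, which adds up very quickly.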

Hi Christian

Thank you for the link. So far we've just had Filebeat pump log files into Logstash, which then sends them to ES. And that will be our first primary focus, as the log files are unusable by themselves, but with ES they give us solid insight into trends.
The problem with the log files is that they are poorly structured, so we have Filebeat doing a 'multiline' interpretation, and then Logstash putting the values into the right fields.

But as you say, it looks like Logstash sends to a new shard for each file?
So should I just merge them manually, or can I make Logstash or ES do it automatically?

If you can show your Logstash configuration we may be able to help spot any issues.

input {
    beats {
            port => 5000
    }
}

#############################################
#  Winlogbeat does not require processing   #
#############################################



#############################################
#  Interpret Portrait filebeats log         #
#############################################

filter {
  if "portrait" in [tags] {
    kv {
      source => "message"
      #target => "kvpairs"
      value_split => ":"
      field_split => "\n"
      include_keys => [ "Time", "Username", "Instance name", "Exception", "Exception type", "ErrorCode", "Module", "Procedure", "Version", "Memory usage" ]
    }
    date {
      match => [ "Time", "MM/dd/yyyy hh:mm:ss aa", "M/d/yyyy hh:mm:ss aa", "dd-MM-yyyy HH:mm:ss", "d-M-yyyy HH:mm:ss", "MM/dd/yyyy", "M/d/yyyy", "dd-MM-yyyy", "d-M-yyyy" ]
      timezone => "Europe/Copenhagen"
    }
  }
}

#############################################



output {
#  stdout {}

  elasticsearch {
    hosts => "elasticsearch:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

So, as this shows, it creates an index for each log file, and each of them has 5 shards when I look at the cluster health.

That looks fine to me. I do not see how that can generate that many shards unless you have uploaded data going back several years. If you are indeed storing data for a very long time period, it would probably make sense to switch to monthly indices instead and also possibly reduce the number of primary shards per index.

We do have log files going back 5-6 years for 25-30 "logfile producers", which has revealed very interesting problems that we were never aware of before we started this.
So could I just modify the 'index' line in Logstash to

                  index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM}"
                      document_type => "%{[@metadata][type]}"

?
If that works, then I see two options: going through all these indices and merging them, or killing all the data and reimporting it. I guess there is no automatic way of merging the existing ones, is there?

That looks correct. You should also look at setting the number of primary shards through an index template. Converting daily indices into monthly ones will, however, require reindexing. Given the number of indices you have, you will probably need to script this.
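
As a rough sketch of what such a script could drive, the reindex API can copy several daily indices into one monthly index, and once you have verified the document counts match you can delete the daily ones (the index names here are just examples):

POST _reindex
{
  "source": { "index": ["filebeat-6.5.1-2017.09.19", "filebeat-6.5.1-2017.09.20"] },
  "dest": { "index": "filebeat-6.5.1-2017.09" }
}

DELETE /filebeat-6.5.1-2017.09.19
DELETE /filebeat-6.5.1-2017.09.20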


Ok, so if I change the index to this:
index => "%{[@metadata][tags]}-%{[@metadata][beat]}-%{+YYYY.MM}"

It will identify the logfiles, and then I could create a template like this:

PUT _template/portrait
{
    "index_patterns": ["portrait*"],
    "settings": {
    "number_of_shards": 1
 }
}

right?

If your indices match the pattern portrait*, that looks correct, although that does not seem to be the case based on the config you shared earlier.
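
Once you have created the template, you can double-check what it contains with:

GET _template/portrait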


Yes, true, but it should match them going forward with the index set as in this post, right?
index => "%{[@metadata][tags]}-%{[@metadata][beat]}-%{+YYYY.MM}"
And then I need to create a script that will run through all the indices and merge them into new indices that match the naming set here.

It seems I can't use tags directly in the index name in the output.
So for now I'm just going to limit myself to changing the daily indices to monthly :slight_smile:
Thank you VERY much for all the insights, Christian!

And of course, thank you, David, for helping me solve the original issue.
