ELK - indexes blocked : SERVICE_UNAVAILABLE/1/state not recovered

Hello,
I recently upgraded Elasticsearch to 6.7.1 (previously 6.2.4), and I also upgraded Kibana. Suddenly I got an exception about monitoring. I searched the internet and found this solution:

PUT /_cluster/settings
{
  "persistent": {
    "xpack.monitoring.collection.enabled": true
  }
}

And it worked for several days, until this morning at 01:50... I have an agent that creates snapshots of new indexes and removes old ones (after snapshotting the previous day's index, it removes the indexes older than 12 months).

Since 01:50, Logstash can't bulk-index any new data. The indexes are blocked. I tried restarting, but it changed nothing.

I'm working with only one machine, no cluster...

I will attach my logs... Could you please help me, I tried everything I could... I searched everywhere I know...
Since this little problem started, in only twelve hours, Elasticsearch has written 554 log files...
I will only attach the first one (all the others are the same)

I created a WeTransfer link... https://we.tl/t-qJ2o2SUq3O

Thanks for your help...

Just in case you need it, here is the log file for my first reboot... Kibana has a lot of problems with monitoring...
WeTransfer link: https://we.tl/t-yp7ZDVj7pE

Search for "closed" to find the right place to begin... I rebooted at 13:01.

I'm really sorry to send you my log files like this, but I really don't know what to do...

The post title says SERVICE_UNAVAILABLE/1/state not recovered but I do not see this kind of block mentioned in the log files you shared. I do, however, see a different kind of block, because you almost ran out of disk space:

[2019-05-03T01:49:50,228][WARN ][o.e.c.r.a.DiskThresholdMonitor] [QHo9DFx] flood stage disk watermark [95%] exceeded on [QHo9DFxYToCEiDvtNuwvQA][QHo9DFx][/var/lib/elasticsearch/nodes/0] free: 14.7gb[4.9%], all indices on this node will be marked read-only
[2019-05-03T01:50:15,174][INFO ][o.e.s.SnapshotsService   ] [QHo9DFx] snapshot [backup_repo_logstash-portail-catalina-out:snapshot-2018.10.22/2x-tRB6ET7KoaNPhP5JdXQ] started
[2019-05-03T01:50:20,423][WARN ][o.e.c.r.a.DiskThresholdMonitor] [QHo9DFx] flood stage disk watermark [95%] exceeded on [QHo9DFxYToCEiDvtNuwvQA][QHo9DFx][/var/lib/elasticsearch/nodes/0] free: 14.5gb[4.9%], all indices on this node will be marked read-only
[2019-05-03T01:50:22,369][WARN ][o.e.x.m.e.l.LocalExporter] [QHo9DFx] unexpected error while indexing monitoring document
org.elasticsearch.xpack.monitoring.exporter.ExportException: ClusterBlockException[blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];]

Ah ok I see a few SERVICE_UNAVAILABLE/1/state not recovered in the second log you shared, shortly after you restarted the node. This is normal behaviour while a node is restarting. Your real problem is the disk space one I mentioned above.
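
If you want to check quickly whether a log file contains these disk warnings, here is a rough Python sketch. The pattern is modelled only on the two DiskThresholdMonitor lines quoted above, so treat it as an assumption that other versions word the message the same way; the function name `find_flood_stage_warnings` is made up for this example.

```python
import re

# Pattern modelled on the DiskThresholdMonitor warnings quoted in this thread.
# Assumption: other Elasticsearch versions may format the message differently.
FLOOD_STAGE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\[WARN\s*\]\[o\.e\.c\.r\.a\.DiskThresholdMonitor\].*"
    r"flood stage disk watermark \[(?P<watermark>[\d.]+)%\] exceeded.*"
    r"free: (?P<free>[\d.]+[kmgt]?b)\[(?P<free_pct>[\d.]+)%\]"
)


def find_flood_stage_warnings(lines):
    """Return (timestamp, watermark %, free space, free %) for each matching line."""
    hits = []
    for line in lines:
        match = FLOOD_STAGE.search(line)
        if match:
            hits.append(
                (
                    match["timestamp"],
                    float(match["watermark"]),
                    match["free"],
                    float(match["free_pct"]),
                )
            )
    return hits
```

You would pass it the lines of the Elasticsearch log, for example `find_flood_stage_warnings(open(path))`, where `path` is wherever your log file lives.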

Thanks for your help... I know that I have this disk problem, but it's a very big disk... It's 300 GB... And there are still 10 GB free...
Because of this space problem, I created the agent that I mentioned in my first post.
I ran it by hand. Now there are 40 GB free, but I still have the problem.
How can I solve it? What is the first step?

My Logstash log says: [2019-05-03T01:50:18,727][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"})

I think I found the solution for the Kibana monitoring problem. I executed this:

PUT _settings
{
  "index": {
    "blocks": {
      "read_only_allow_delete": "false"
    }
  }
}

Could you please explain to me why I had to do it?

Sure, from the docs I linked above:

The index block must be released manually once there is enough disk space available to allow indexing operations to continue.

300GB is quite a small disk, and 10GB is less than 4% free. By default Elasticsearch makes indices read-only when the free space drops below 5%. This allows a small amount of extra space for recovery.
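
To make the arithmetic concrete, here is a minimal Python sketch. It assumes the default flood-stage watermark of 95% used, and the helper names (`free_percent`, `is_past_flood_stage`, `release_block_request`) are invented for illustration. The last function only builds the request for releasing the block and does not send anything; it sets the value to null, which resets the setting, rather than the "false" used earlier in this thread.

```python
import json

# Assumption: the default flood-stage watermark (95% of the disk used).
FLOOD_STAGE_USED_PCT = 95.0


def free_percent(free_gb, total_gb):
    """Percentage of the disk that is still free."""
    return 100.0 * free_gb / total_gb


def is_past_flood_stage(free_gb, total_gb, flood_stage_used_pct=FLOOD_STAGE_USED_PCT):
    """True when free space is below the flood-stage limit (5% free by default)."""
    return free_percent(free_gb, total_gb) < 100.0 - flood_stage_used_pct


def release_block_request():
    """Build (method, path, body) for clearing the read-only block on all indices.

    Setting the value to null resets the setting. Nothing is sent here; use
    the Kibana console or any HTTP client against your own node.
    """
    body = {"index": {"blocks": {"read_only_allow_delete": None}}}
    return "PUT", "/_settings", json.dumps(body)


if __name__ == "__main__":
    print(round(free_percent(10, 300), 1))   # 10 GB free of 300 GB
    print(is_past_flood_stage(10, 300))      # below 5% free
    print(is_past_flood_stage(40, 300))      # after freeing space
    print(release_block_request())
```

With 40 GB free the disk is back under the watermark, but as the quote above says, the block stays in place until it is released by hand.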

Thanks... I don't have the solution, but you should think about making error messages easier to read...
For example: suggest possible solutions to problems... I searched everywhere for my problem, and usually I don't need to bother people on forums. But this time I really couldn't find the problem and I had no idea where to start...

Thanks for your help
