FORBIDDEN/8/index write (api) - During date change on indexes

Hey,

I'm seeing some weird issues and have yet to find what is causing this behaviour. I have various indexes using the name-%{+YYYY.MM.dd} index pattern. Around the time the date changes and today's index is created, something produces a write error on the indexes, causing them all to go read-only and therefore stopping the ingestion of logs.
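
For context, the relevant part of each Logstash elasticsearch output looks roughly like this (the host and index prefix are placeholders, not my real values):

output {
  elasticsearch {
    # placeholder host
    hosts => ["http://localhost:9200"]
    # daily index, e.g. ciscoasa-2019.05.02
    index => "ciscoasa-%{+YYYY.MM.dd}"
  }
}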

The error I see from Logstash is:

[2019-05-02T00:59:49,862][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})
[2019-05-02T00:59:49,863][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})
[2019-05-02T00:59:49,863][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})
[2019-05-02T00:59:49,864][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>3}
[2019-05-02T00:59:49,873][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})
[2019-05-02T00:59:49,873][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})

Elasticsearch logs don't show much unfortunately. The most I see across all nodes in the cluster is:

[2019-05-02T00:00:13,246][INFO ][o.e.c.m.MetaDataIndexTemplateService] [node-3] adding template [kibana_index_template:.kibana] for index patterns [.kibana]
[2019-05-02T01:00:00,020][INFO ][o.e.x.m.e.l.LocalExporter] [node-3] cleaning up [1] old indices
[2019-05-02T01:00:00,021][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node-3] [.monitoring-logstash-6-2019.04.24/di6K_TjSSvmme6TrZfPFVg] deleting index
[2019-05-02T01:00:00,322][INFO ][o.e.c.m.MetaDataCreateIndexService] [node-3] [ciscoasa-2019.05.02] creating index, cause [auto(bulk api)], templates [ciscoasa-], shards [3]/[1], mappings [doc]
[2019-05-02T01:00:00,518][INFO ][o.e.c.m.MetaDataMappingService] [node-3] [ciscoasa-2019.05.02/0gsB8vICSuu95M28nLsM4A] update_mapping [doc]
[2019-05-02T01:00:01,026][INFO ][o.e.c.r.a.AllocationService] [node-3] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[ciscoasa-2019.05.02][0]] ...]).
[2019-05-02T01:00:02,477][INFO ][o.e.c.m.MetaDataCreateIndexService] [node-3] [.monitoring-es-6-2019.05.02] creating index, cause [auto(bulk api)], templates [.monitoring-es], shards [1]/[0], mappings [doc]
[2019-05-02T01:00:02,478][INFO ][o.e.c.r.a.AllocationService] [node-3] updating number_of_replicas to [1] for indices [.monitoring-es-6-2019.05.02]
[2019-05-02T01:00:03,437][INFO ][o.e.c.r.a.AllocationService] [node-3] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.monitoring-es-6-2019.05.02][0]] ...]).  
[2019-05-02T01:00:09,222][INFO ][o.e.c.m.MetaDataCreateIndexService] [node-3] [.monitoring-logstash-6-2019.05.02] creating index, cause [auto(bulk api)], templates [.monitoring-logstash], shards [1]/[0], mappings [doc]
[2019-05-02T01:00:09,222][INFO ][o.e.c.r.a.AllocationService] [node-3] updating number_of_replicas to [1] for indices [.monitoring-logstash-6-2019.05.02]
[2019-05-02T01:00:09,469][INFO ][o.e.c.r.a.AllocationService] [node-3] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.monitoring-logstash-6-2019.05.02][0]] ...]).
[2019-05-02T01:00:13,434][INFO ][o.e.c.m.MetaDataCreateIndexService] [node-3] [.monitoring-kibana-6-2019.05.02] creating index, cause [auto(bulk api)], templates [.monitoring-kibana], shards [1]/[0], mappings [doc]
[2019-05-02T01:00:13,435][INFO ][o.e.c.r.a.AllocationService] [node-3] updating number_of_replicas to [1] for indices [.monitoring-kibana-6-2019.05.02]
[2019-05-02T01:00:13,702][INFO ][o.e.c.r.a.AllocationService] [node-3] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.monitoring-kibana-6-2019.05.02][0]] ...]).

(Judging by the times here, I'm assuming this is because I'm in BST, i.e. UTC+1.)

When I checked index.blocks.write on all the indexes, only 2 of the 5 had been set to true, yet none of the indexes were being written to.
After setting these values back to false, everything started working again.

Now, I know from having a look online that the FORBIDDEN/8/index write (api) error can relate to low memory or low disk space. However, I've increased resources for both and I'm nowhere near any of the limits, so I'm ruling that out as the cause.
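
For reference, disk and heap headroom can be double-checked per node with the standard _cat APIs, e.g.:

# per-node disk used/available and shard counts
curl -s 'http://localhost:9200/_cat/allocation?v'
# per-node heap, RAM and load overview
curl -s 'http://localhost:9200/_cat/nodes?v'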

The other weird behaviour I'm seeing is that I can successfully create an index manually using the same index pattern, which rules out any block on creating new indexes:

curl -XPOST -H "Content-Type: application/json" "http://localhost:9200/ciscoasa-2019.06.01/doc" -d "{ \"field\" : \"value\"}" # random date in future to ensure the template is matched

Result:

{
  "_index" : "ciscoasa-2019.06.01",
  "_type" : "doc",
  "_id" : "bMLDd2oBbf8VYDve6pJ_",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
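
For completeness, the test index can then be checked for a write block and removed again (same test index as above):

# confirm the manually created index has no write block set
curl -s 'http://localhost:9200/ciscoasa-2019.06.01/_settings/index.blocks.write?pretty'
# clean up the test index so it doesn't linger
curl -X DELETE 'http://localhost:9200/ciscoasa-2019.06.01'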

The reason I've tagged this post with index-lifecycle-management is that I'm not sure whether it has a part to play in this.
The index shown in the logs has the following policy attached to it:

{
    "policy": {
        "phases": {
            "hot": {
                "min_age": "0ms",
                "actions": {
                    "set_priority": {
                        "priority": 100
                    }
                }
            },
            "warm": {
                "min_age": "1d",
                "actions": {
                    "forcemerge": {
                        "max_num_segments": 1
                    },
                    "set_priority": {
                        "priority": 50
                    }
                }
            },
            "delete": {
                "min_age": "15d",
                "actions": {
                    "delete": {}
                }
            }
        }
    }
}

I'm wondering if the warm phase is kicking in early and setting the index to read-only too soon, or something along those lines?
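
If it helps anyone checking the same thing, the ILM explain API should show which policy, phase, action and step an index is currently in, e.g. for the index from the logs above:

# shows the attached policy and the current phase/action/step for the index
curl -s 'http://localhost:9200/ciscoasa-2019.05.02/_ilm/explain?pretty'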

Has anyone seen this behaviour before, or does anyone know why this might be happening on a nightly basis?

Any help would be appreciated.

Cheers,
Kev

What is the size of your cluster? How many indices and shards do you have in it?

3 node cluster
164 indices
1021 shards

900GB of space, using around 140GB currently

ILM doesn't automatically set the index to read-only for the warm phase. The only thing that would do that is if you had an explicit read_only action in the phase, or if you were using the shrink action (which sets the index to read-only because that is a prerequisite for shrinking).
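
For illustration, a warm phase that would apply a write block would have to contain something like the following fragment (not your policy, just a sketch; the API name for the read-only action is readonly):

"warm": {
  "min_age": "1d",
  "actions": {
    "readonly": {},
    "shrink": { "number_of_shards": 1 }
  }
}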

So I take it a force merge doesn't require the index to be set to read-only first?

Any other ideas on what this could be?

So I made two changes yesterday:

  1. Updated the elasticsearch outputs to include manage_template => false
  2. Updated the Index Lifecycle Management warm phase min_age value to 25h

This has stopped the issue from occurring, but I'm looking to confirm which of the above fixed it. I'd put my money on the ILM change (sketched below), as the elasticsearch output had been working fine without the manage_template parameter, and it was ILM that was recently updated.
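
For reference, change 2 was applied along these lines (the policy name my_policy is a placeholder; the body is the same policy as above with only the warm min_age changed):

curl -X PUT -H "Content-Type: application/json" 'http://localhost:9200/_ilm/policy/my_policy' -d '
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": { "set_priority": { "priority": 100 } }
      },
      "warm": {
        "min_age": "25h",
        "actions": {
          "forcemerge": { "max_num_segments": 1 },
          "set_priority": { "priority": 50 }
        }
      },
      "delete": {
        "min_age": "15d",
        "actions": { "delete": {} }
      }
    }
  }
}'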

Can anyone confirm?

Nope, no prerequisite for that.

I don't think either of those would cause anything related to the index being read-only. Is it possible that your node ran out of disk space and the index was marked read-only because of that?

@dakrone no, this wasn't the case, as there are plenty of resources available (see my earlier reply with the disk usage).

Please also see the stats below (the last 2 nodes are client nodes, hence no shards).

This still appears to be occurring, however at different times now. What's also weird is that it's not all of the indices.

curl -s -XGET -H "Content-Type: application/json" 'http://localhost:9200/%2A/_settings/index.blocks.write?pretty' | grep -B4 true
  "ciscoasa-2019.05.13" : {
    "settings" : {
      "index" : {
        "blocks" : {
          "write" : "true"
--
  "esxi-2019.05.13" : {
    "settings" : {
      "index" : {
        "blocks" : {
          "write" : "true"
--
  "heartbeat-6.7.0-2019.05.13" : {
    "settings" : {
      "index" : {
        "blocks" : {
          "write" : "true"
--
  "winlogbeat-6.7.0-2019.05.13" : {
    "settings" : {
      "index" : {
        "blocks" : {
          "write" : "true"
--
  "filebeat-6.7.0-2019.05.13" : {
    "settings" : {
      "index" : {
        "blocks" : {
          "write" : "true"

Even though the above 5 have the write block set to true, there are another 4 indices from the same date that do not have this parameter set.

I have two Logstash servers with the same input/filter/output configuration, but only one appears to be showing these FORBIDDEN errors in its logs. These are still occurring even though indices with today's date exist, and the other Logstash server is writing to them without issue.

[2019-05-14T10:06:19,749][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})
[2019-05-14T10:06:19,749][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})

In order to stop these errors and correct this, I have to set index.blocks.write back to false:

curl -X PUT -H "Content-Type: application/json" 'http://localhost:9200/%2A/_settings' -d '{ "index": { "blocks": { "write": "false" } } } '
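
Rather than hitting every index with *, the same call can also be pointed at just the blocked indices, e.g.:

# only clears the write block on the named indices
curl -X PUT -H "Content-Type: application/json" 'http://localhost:9200/ciscoasa-2019.05.13,esxi-2019.05.13/_settings' -d '{ "index": { "blocks": { "write": "false" } } }'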

Any ideas why this could be occurring so often when resources are fine?

After analysing this further following my last reply, it was intriguing that only one Logstash server was producing these errors.
Looking into it further, I noticed in the Logstash monitoring that the JVM heap usage was at 92% on the affected node, which, going by this Stack Overflow article, seems to be a situation in which this lock can occur.
I did originally come across that article, however at the time it was only the Elasticsearch JVM options that I had increased.

Now that I've increased the Logstash JVM options as well, I'll see if this re-occurs.
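
For anyone wondering, the Logstash heap is set in config/jvm.options on the Logstash server; the values below are just an example, not a recommendation:

# config/jvm.options (Logstash) - example heap size only
-Xms2g
-Xmx2g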

So I updated the ILM warm phase back to a min_age of 24h and have experienced the same FORBIDDEN error in the logs, with the indices again unwritable.
@dakrone are you sure there's no prerequisite? The behaviour above suggests otherwise.

EDIT: Just found out that a force_merge does indeed set index.blocks.write to true, as confirmed here.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.