Logstash cannot index data to Elasticsearch [FORBIDDEN/8/index write (api)]

I'm trying to use Logstash to index data into Elasticsearch. At first it indexes successfully, but after a few moments, when I run tailf /var/log/logstash/logstash-plain.log, it gives me this error:

    retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})
    Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}
    retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})
    Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}
    retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})
    Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}

My logstash output config:

    elasticsearch {
        hosts => ["127.0.0.1:9200"]
        index => "logstash-%{+yyyy.MM.dd}-%{shift}"
        manage_template => false
    }

My elasticsearch config:

bootstrap.memory_lock: true
cluster.name: elasticsearch
cluster.routing.allocation.awareness.attributes: machine
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.unicast.hosts: ["127.0.0.1:9301"]
path.repo: ["/home/repository"]

node.master: true
node.data: true
node.ingest: false
node.attr.box_type: warm
node.name: hot_es01
node.attr.machine: ${HOSTNAME}

http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-methods: GET, POST, PUT
http.host: 0.0.0.0
http.port: 9200
transport.tcp.port: 9300

What could cause this?
I can think of read-only indices or ES running out of disk, but I checked and neither is the case.
Please help! Thanks.

Is it possible that you were near the disk limit earlier? If that happens and the shards cannot be moved elsewhere, those shards are set to read-only. You can unset this manually; see the docs at https://www.elastic.co/guide/en/elasticsearch/reference/6.6/disk-allocator.html
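
Concretely, clearing that block would look something like this (untested against your setup, and _all targets every index, so scope it down if you prefer):

    curl -XPUT 'localhost:9200/_all/_settings' -H 'Content-Type: application/json' -d '
    {
      "index.blocks.read_only_allow_delete": null
    }'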

I'm pretty confident that the disk has never reached its limit.
I deployed ES with docker-compose:

    elastic5:
        extends:
            file: common.yml
            service: common
        container_name: elastic5
        image: elasticsearch:5.6.10
        hostname: elastic5
        network_mode: "host"
        volumes:
            - /u01/elastic5:/usr/share/elasticsearch/data:rw
            - /u01/repository:/home/repository:rw
            - ./elasticsearch/hot/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
        environment:
            - "ES_JAVA_OPTS=-Xms2g -Xmx2g -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp"

And the volume /u01 has 827G available:

    /dev/sdb1      1008G  131G  827G  14% /u01
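
I also checked the disk headroom that Elasticsearch itself sees, with the cat allocation API:

    curl -XGET 'localhost:9200/_cat/allocation?v'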

Is there anything Logstash can do that would make Elasticsearch block it? Like indexing an event whose fields don't match the Elasticsearch template?
Thanks!

The disk threshold causes a block called index read-only / allow delete (api), which is different from this one. AFAIK the index write (api) block is never applied automatically; the only way it can arise is if someone or something sets the index to read-only through the API, either by setting index.blocks.write: true or by freezing the index (which I think also sets index.blocks.write: true).
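
To see whether that block is set, and to clear it, something like this should work (the index name here is made up; substitute your own):

    # show the settings, including any blocks, on the index
    curl -XGET 'localhost:9200/logstash-2019.02.27-a/_settings?pretty'

    # remove the write block
    curl -XPUT 'localhost:9200/logstash-2019.02.27-a/_settings' -H 'Content-Type: application/json' -d '
    {
      "index.blocks.write": false
    }'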


Thank you David.
Setting index.blocks.write to false actually works, but I never set it to true before.
I think the cause is a wrong mapping in my ES template.
Whenever my Logstash pushes an event that has an @timestamp field, ES blocks my Logstash from indexing into it:
{"@timestamp": "2019-02-27T02:27:22.571Z", "event_id": "a510982a-79dc-4104-a670-3ea5911ceb6c"}
My template for @timestamp:

        "@timestamp": {
          "format": "dateOptionalTime",
          "index": "not_analyzed",
          "type": "date"
        }

And when I remove @timestamp from the event, I can index it into ES successfully.
Please tell me what is wrong with this field and its template. Thanks.
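
For reference, this is how I'm pulling back the template ES actually has installed (the generic endpoint lists every template; append your template's name to narrow it down):

    curl -XGET 'localhost:9200/_template?pretty'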

The docs say that the format is called date_optional_time, not dateOptionalTime.

But I don't think that the consequence of this would be that the index is set to read-only.
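
If you want to rule the format out in isolation, you could try it on a throwaway index; a quick sketch (the index name format-test and the type name doc are arbitrary):

    # create a test index whose only mapping is the date format under test
    curl -XPUT 'localhost:9200/format-test' -H 'Content-Type: application/json' -d '
    {
      "mappings": {
        "doc": {
          "properties": {
            "@timestamp": { "type": "date", "format": "date_optional_time" }
          }
        }
      }
    }'

    # index the sample value; a bad format shows up as a 400 mapper_parsing_exception,
    # not as a 403 cluster block
    curl -XPOST 'localhost:9200/format-test/doc' -H 'Content-Type: application/json' -d '
    {
      "@timestamp": "2019-02-27T02:27:22.571Z"
    }'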

I am very confused right now.
It is not actually set to read-only: I can still index into it using the ES API. It only blocks my Logstash instance from indexing into it (even events that don't have the @timestamp field can't be indexed). But after I set index.blocks.write: false, my Logstash can index other events into ES, and it generates these logs:

    retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/8/index write (api)];"})
    Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}


    [Consumer clientId=logstash-3, groupId=logstash2] Synchronous auto-commit of offsets {logsSIRC-6=OffsetAndMetadata{offset=1887, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
    [Consumer clientId=logstash-0, groupId=logstash2] Synchronous auto-commit of offsets {logsSIRC-0=OffsetAndMetadata{offset=1877, metadata=''}, logsSIRC-1=OffsetAndMetadata{offset=1956, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
    [Consumer clientId=logstash-3, groupId=logstash2] Revoking previously assigned partitions [logsSIRC-6]
    [Consumer clientId=logstash-3, groupId=logstash2] (Re-)joining group
    [Consumer clientId=logstash-0, groupId=logstash2] Revoking previously assigned partitions [logsSIRC-0, logsSIRC-1]
    [Consumer clientId=logstash-0, groupId=logstash2] (Re-)joining group
    [Consumer clientId=logstash-2, groupId=logstash2] Synchronous auto-commit of offsets {logsSIRC-4=OffsetAndMetadata{offset=1947, metadata=''}, logsSIRC-5=OffsetAndMetadata{offset=1906, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
    [Consumer clientId=logstash-2, groupId=logstash2] Revoking previously assigned partitions [logsSIRC-4, logsSIRC-5]
    [Consumer clientId=logstash-2, groupId=logstash2] (Re-)joining group
    [Consumer clientId=logstash-4, groupId=logstash2] Synchronous auto-commit of offsets {logsSIRC-7=OffsetAndMetadata{offset=1952, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
    [Consumer clientId=logstash-4, groupId=logstash2] Revoking previously assigned partitions [logsSIRC-7]
    [Consumer clientId=logstash-1, groupId=logstash2] Synchronous auto-commit of offsets {logsSIRC-2=OffsetAndMetadata{offset=1899, metadata=''}, logsSIRC-3=OffsetAndMetadata{offset=1882, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
    [Consumer clientId=logstash-1, groupId=logstash2] Revoking previously assigned partitions [logsSIRC-2, logsSIRC-3]
    [Consumer clientId=logstash-1, groupId=logstash2] (Re-)joining group
    [Consumer clientId=logstash-4, groupId=logstash2] (Re-)joining group
    [Consumer clientId=logstash-1, groupId=logstash2] (Re-)joining group
    [Consumer clientId=logstash-0, groupId=logstash2] (Re-)joining group
    [Consumer clientId=logstash-3, groupId=logstash2] (Re-)joining group
    [Consumer clientId=logstash-3, groupId=logstash2] Successfully joined group with generation 33
    [Consumer clientId=logstash-0, groupId=logstash2] Successfully joined group with generation 33
    [Consumer clientId=logstash-3, groupId=logstash2] Setting newly assigned partitions [logsSIRC-6]
    [Consumer clientId=logstash-0, groupId=logstash2] Setting newly assigned partitions [logsSIRC-0, logsSIRC-1]
    [Consumer clientId=logstash-2, groupId=logstash2] Successfully joined group with generation 33
    [Consumer clientId=logstash-2, groupId=logstash2] Setting newly assigned partitions [logsSIRC-4, logsSIRC-5]
    [Consumer clientId=logstash-1, groupId=logstash2] Successfully joined group with generation 33
    [Consumer clientId=logstash-1, groupId=logstash2] Setting newly assigned partitions [logsSIRC-2, logsSIRC-3]
    [Consumer clientId=logstash-4, groupId=logstash2] Successfully joined group with generation 33
    [Consumer clientId=logstash-4, groupId=logstash2] Setting newly assigned partitions [logsSIRC-7]
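
Side note: if I read those Kafka warnings right, the consumer should either get a longer timeout or smaller poll batches. On the Logstash side that would look roughly like this in the kafka input (broker address and numbers are illustrative, not recommendations):

    kafka {
        bootstrap_servers => "localhost:9092"   # placeholder; use your brokers
        topics => ["logsSIRC"]
        group_id => "logstash2"
        # smaller poll batches so each loop iteration finishes sooner
        max_poll_records => "100"
        # more time before the group coordinator gives up on a member
        session_timeout_ms => "30000"
    }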

PS: changing the format to date_optional_time does not work either.

Hmm, are you running Elasticsearch using the AWS Elasticsearch Service? I just found in their docs that they apply this block if your instance is persistently low on memory.

Other than that, I can't think of a mechanism that would cause this block to be put in place. I don't expect that Logstash would be applying it, but I don't know Logstash very well so it's possible. Could you ask in the Logstash forum?

No, I'm running it on my local server.
I'll try asking in the Logstash forum.
Thanks for your replies. ^^

