Hi,
We have set up a trial of the ELK stack (6.6.2) with 2 Logstash nodes, a 3-node Elasticsearch cluster, and 1 Kibana node, configured so that messages flow from our firewalls => Logstash => Elasticsearch => Kibana.
Logstash runs fine for a few hours but then starts failing with the error below, and this happens frequently. It does not appear to be a disk space problem; all nodes have sufficient disk space available.
Whenever we encounter this issue, running the following command on each ES cluster node and then restarting all Elasticsearch and Logstash nodes resolves it:
curl -XPUT -H "Content-Type: application/json" http://<>:9200/_all/_settings -d '{"index.blocks.read_only_allow_delete": null}'
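For reference, per-node disk usage and the effective disk watermark settings can be checked like this (just a sketch, using the same host placeholder as above):

# per-node disk used/free and shard counts
curl -s 'http://<>:9200/_cat/allocation?v'
# effective disk watermark thresholds, defaults included
curl -s 'http://<>:9200/_cluster/settings?include_defaults=true&flat_settings=true' | grep disk.watermark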
We came to know this is because the indices are getting locked (made read-only).
So I wanted to know more about the following:
- Why are the indices getting locked, and how can we avoid this?
- Would you suggest a better approach, or specific Elasticsearch settings, to avoid the indices getting locked?
- When all nodes come back up, Logstash tries to send all the messages it holds in memory and logs entries like those in the "Buffered Entries" section below. Will Logstash deliver every message it received and buffered, or is there a chance of losing data?
We are not using persistent queues on the Logstash side.
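If persistent queues are the right fix here, this is roughly what we would enable in logstash.yml (a sketch only; the 4gb cap is an assumed example, and the path is the default package location):

# sketch: enable Logstash's on-disk queue so buffered events survive restarts
cat >> /etc/logstash/logstash.yml <<'EOF'
queue.type: persisted
queue.max_bytes: 4gb
EOF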
Error Details:
[2019-03-28T00:22:54,815][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"})
[2019-03-28T00:22:54,815][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>125}
[2019-03-28T23:12:09,316][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"cluster_block_exception", "reason"=>"blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"})
[... the same 403 retry line repeats several more times ...]
[2019-03-28T23:12:09,317][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>71}
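Before resetting the block, the indices currently carrying it can be listed like this (a sketch, same host placeholder):

curl -s 'http://<>:9200/_all/_settings/index.blocks.read_only_allow_delete?pretty'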
Missing Data
Here is a snapshot of the last 7 days of data. You can see gaps in between where data is missing. Restarting sometimes brought the flow back.
Disk Space Availability
- The daily volume from a single firewall can vary from a few MB to more than a GB (e.g., 1.6 or 1.7 GB). Initially we also logged to a file, but we have since turned that off. Per-index sizes can be listed as shown below.
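A sketch of how the per-day index sizes can be listed, largest first (same host placeholder):

curl -s 'http://<>:9200/_cat/indices?v&s=store.size:desc'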
Buffered Entries
[2019-03-28T18:28:36,854][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of processing of [101889343][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[logstash-pfw01-vsys1-2019.03.28][2]] containing [9] requests, target allocation id: GX4dk7HHTm2pTBg3xbefbw, primary term: 1 on EsThreadPoolExecutor[name = es-node1/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@ea77817[Running, pool size = 4, active threads = 4, queued tasks = 200, completed tasks = 89000938]]"})
[2019-03-28T18:28:36,854][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of processing of [101889361][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[logstash-pfw01-vsys1-2019.03.28][3]] containing [9] requests, target allocation id: qSjfpiMkSH6MikM6LjZwcg, primary term: 1 on EsThreadPoolExecutor[name = es-node1/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@ea77817[Running, pool size = 4, active threads = 4, queued tasks = 200, completed tasks = 89000944]]"})
[2019-03-28T18:28:36,855][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 429 ({"type"=>"es_rejected_execution_exception", "reason"=>"rejected execution of processing of [101889332][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[logstash-vfw07-vsys1-2019.03.28][0]] containing [7] requests, target allocation id: xBhWfCN8SlugOEeMDVo55A, primary term: 1 on EsThreadPoolExecutor[name = es-node1/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@ea77817[Running, pool size = 4, active threads = 4, queued tasks = 200, completed tasks = 89000938]]"})
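These 429s indicate the write thread pool queue (capacity 200 per the log) is full while Logstash replays its backlog. Queue depth and rejection counts can be watched like this (a sketch; the column selection is just a convenient set):

curl -s 'http://<>:9200/_cat/thread_pool/write?v&h=node_name,active,queue,rejected,completed'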
Any help in this regard is highly appreciated.
Thanks,
Karunya