Dear ELK stack specialists, I am seeing some weird behavior with my ELK stack. My retention policy is set up so that all indices older than 14 days are closed and all indices older than 3 months are deleted. I use the elasticsearch-curator package for this and it works as expected. However, if any of my filebeat indices are closed, Logstash keeps trying to publish events into the closed index(es) and eventually fails, since its pipeline gets overloaded and becomes stale. Reopening all the indices fixes the issue, but I am wondering if there is a way to fix this permanently rather than applying a workaround. The reason I close indices is to save some RAM. I also use Metricbeat and have no problems with closed metricbeat indices (only with filebeat). Here are all the config files I have:
1) Error message:
[2017-12-26T06:24:58,293][INFO ][logstash.outputs.elasticsearch] Retrying individual bulk actions that failed or were rejected by the previous bulk request. {:count=>1}
[2017-12-26T06:24:58,293][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 403 ({"type"=>"index_closed_exception", "reason"=>"closed", "index_uuid"=>"U2IoWDefQLan4DL4HxgPxA", "index"=>"filebeat-2017.11.28"})
2) /etc/logstash/conf.d/30-elasticsearch-output.conf:
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
}
3) /opt/elasticsearch-curator/curator-action.yml
---
actions:
  1:
    action: close
    description: close indices
    options:
      delete_aliases: False
      timeout_override:
      continue_if_exception: True
      disable_action: False
    filters:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 14
      exclude:
  2:
    action: delete_indices
    description: delete indices
    filters:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 90
      exclude:
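For reference, an action file like this is typically applied with the curator CLI; the client-config path below is an assumption, not taken from this setup:

# Preview what would be closed/deleted first, then run for real
curator --config /opt/elasticsearch-curator/curator-config.yml --dry-run /opt/elasticsearch-curator/curator-action.yml
curator --config /opt/elasticsearch-curator/curator-config.yml /opt/elasticsearch-curator/curator-action.yml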
This looks very similar to this topic. Please avoid opening multiple threads for the same issue.
Keeping indices open in order to be able to query and write to them is how it is supposed to work, so I would not classify that as a workaround. It sounds like you may be doing something in a non-optimal way.
What is your sharding strategy, given that you need to close indices that quickly? How many indices/shards are you creating per day? What is your average shard size?
Why do you have data coming in with such an extreme delay?
Thank you for your reply, Christian. I did see that similar topic, but it seems to be dead (no replies). Correct me if I am wrong, but I read that having fewer open indices puts less load on RAM, and this is the main reason why I close them. I only have two indices being created per day (one for filebeat and one for metricbeat), with an average size of ~700MB for filebeat and ~900MB for metricbeat. I don't see the data arriving with a delay in Kibana, but for some reason Logstash tries to publish into every index I have, and I don't know why.
What does your Beats configuration look like? Do you send data directly from Beats to Elasticsearch?
At that data volume you should also consider creating an index template that sets the number of primary shards to 1 as well, which will save heap space.
Closing indices does reduce heap usage but comes with a number of drawbacks as you have noticed. Closing indices is often a workaround for clusters that struggle, although rarely a good long term solution.
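As a rough sketch of what such a template could look like (the template name and patterns below are placeholders; on Elasticsearch 5.x the "index_patterns" key is "template" with a single pattern string instead):

# Hypothetical template name/patterns; use an order higher than the stock Beats template so this setting wins
curl -XPUT 'localhost:9200/_template/beats_single_shard' -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["filebeat-*", "metricbeat-*"],
  "order": 1,
  "settings": {
    "index.number_of_shards": 1
  }
}'

Note that templates only apply to indices created after they are added, so the reduced shard count takes effect from the next daily index onwards.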
Hey Christian, thank you again for the quick reply. I am sending Beats data to Logstash, and Logstash then passes it on to Elasticsearch. How do I create this index template that sets the number of primary shards to 1? I know I loaded the .json Beats templates into Elasticsearch (filebeat-index-template.json). I think I will stop closing indices from now on and monitor the system's resources for a while. However, if there are any other methods, as you mentioned, to reduce heap usage, it would be nice to implement them.
Here is my filebeat config for example:
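Purely as an illustrative sketch of a Filebeat-to-Logstash setup (not the actual file; the log path and port 5044 are assumptions), a minimal configuration looks roughly like this:

filebeat.prospectors:            # renamed to filebeat.inputs in later Filebeat versions
- type: log                      # "input_type: log" on Filebeat 5.x
  paths:
    - /var/log/*.log

output.logstash:
  hosts: ["localhost:5044"]      # must match the port of the beats input in the Logstash pipeline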