I'm a newbie with ES, and I'm trying to harden it in order to avoid failures.
Last week, I ran into the well-known issue where ES marks indices read-only when less than 5% of disk space is left. I deleted old indices, changed the setting so ES would accept writes again, and everything went back to normal. Unfortunately, everything sent during that time was lost.
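For reference, re-enabling writes boils down to resetting index.blocks.read_only_allow_delete. A minimal sketch of what that looks like (assuming a single node reachable on localhost:9200; adjust host and authentication to your setup):

```python
import requests

# Clear the flood-stage block on all indices so writes are accepted again.
# Setting the value to null (None in Python) removes the block.
resp = requests.put(
    "http://localhost:9200/_all/_settings",
    json={"index.blocks.read_only_allow_delete": None},
)
print(resp.json())
```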
Right now I'm trying to find the best way to avoid this in the future, as my input data is growing rapidly.
In front of Logstash, I'll put a Redis/Kafka cluster to load-balance my input across multiple Logstash instances. Do you think enabling the persistent queue will be enough on the Logstash side?
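For context, enabling the persistent queue is just a couple of lines in logstash.yml; a minimal sketch (the size and path are placeholder values to adjust to your disk budget):

```yaml
# logstash.yml -- persistent queue sketch
queue.type: persisted                 # buffer events on disk instead of in memory
queue.max_bytes: 4gb                  # illustrative cap; tune to available disk
path.queue: /var/lib/logstash/queue   # assumed path; adjust to your install
```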
Rather than adding disk capacity to Logstash / Kafka / Redis, why not add it to Elasticsearch instead? IMO you are adding a component and moving the problem around, but it doesn't solve the issue.
Let's say you have 1TB of disk capacity on Elasticsearch. Once that is full, additional data can sit in the queue; let's say you have 100GB there. But once you have filled up 1.1TB (regardless of whether it's all in Elasticsearch, or 1TB in Elasticsearch and the rest in the queue), your data has nowhere to go again.
In the end you'll need proper monitoring of your disk space, and you'll need to clean up accordingly. Today Elasticsearch Curator can help you with that. Soon there will also be Index Lifecycle Management.
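As an illustration, a minimal Curator action file that deletes time-based indices older than 30 days (the logstash- prefix, daily naming pattern, and retention period are assumptions to adapt to your indices):

```yaml
# delete_old.yml -- run with: curator --config config.yml delete_old.yml
actions:
  1:
    action: delete_indices
    description: "Delete logstash indices older than 30 days"
    options:
      ignore_empty_list: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: logstash-
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y.%m.%d'
        unit: days
        unit_count: 30
```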
PS: You can change the 5% threshold, but this is just slightly moving your problem around and not solving the underlying issue.
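For completeness, those thresholds live in the disk watermark cluster settings; a minimal sketch of raising them (the percentages are illustrative only, and again this only buys you time):

```python
import requests

# Raise the disk watermarks via the cluster settings API.
# Transient settings are lost on a full cluster restart; use "persistent" to keep them.
resp = requests.put(
    "http://localhost:9200/_cluster/settings",
    json={
        "transient": {
            "cluster.routing.allocation.disk.watermark.low": "90%",
            "cluster.routing.allocation.disk.watermark.high": "95%",
            "cluster.routing.allocation.disk.watermark.flood_stage": "97%",
        }
    },
)
print(resp.json())
```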