I have an ES cluster where I store a few logs. Because these logs are not really important, I only have primary shards for them (no replicas). Today a node of this cluster died. Because there were no replicas and the node was using an ephemeral disk, we lost some data and the cluster status became red. To recover, I deleted all the lost indices using the ES API and the cluster went green again.
Is it possible to keep the cluster green even after losing some data, without explicit human action?
A cluster can be green if and only if all shards of all the indexes are present. If you don't have any replicas and lose a single primary shard, there's no way for that index to be complete: you're missing some data, and hence your cluster cannot be green. What would be the value of green if the cluster were green even though you're missing data?
You either need to delete your index as you did or restore a snapshot in order to get back to green.
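If the point is to avoid doing this by hand, the delete step can be scripted. The following is only a minimal sketch under stated assumptions: the cluster is reachable at http://localhost:9200 without authentication, and the function names (red_index_names, fetch_red_indices, delete_red_indices) are made up for illustration. It asks the _cat/indices API for indices whose health is red and deletes them, which permanently discards whatever data those indices still hold.

```python
import json
import urllib.parse
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster, no authentication


def red_index_names(rows):
    """Return the names of red indices from parsed _cat/indices JSON rows."""
    return [row["index"] for row in rows if row.get("health") == "red"]


def fetch_red_indices(es_url=ES_URL):
    """Ask the cluster for its red indices, as a list of names."""
    url = es_url + "/_cat/indices?health=red&format=json&h=index,health"
    with urllib.request.urlopen(url) as resp:
        return red_index_names(json.load(resp))


def delete_red_indices(es_url=ES_URL):
    """Delete every red index. Destructive: remaining shards are lost too."""
    deleted = []
    for name in fetch_red_indices(es_url):
        target = es_url + "/" + urllib.parse.quote(name, safe="")
        req = urllib.request.Request(target, method="DELETE")
        with urllib.request.urlopen(req):
            deleted.append(name)
    return deleted
```

Something like delete_red_indices() could be run from a cron job or a monitoring hook. Whether silently dropping red indices is acceptable depends on how disposable the logs really are.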
To be honest, I've not tried that case, but your indexing code (using bulk or otherwise) would probably be notified that something went wrong with the index action and it could send another indexing request towards another index.
It's easy to try it out, though. Create an empty index with one shard, stop the cluster, delete the shard on the file system, restart the cluster and see what happens.
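For the first step of that experiment, a rough sketch of creating the single-shard index and checking cluster health from Python could look like this. The URL http://localhost:9200, the index name "test" and the helper names are assumptions for illustration; stopping the node and deleting the shard directory still has to be done by hand.

```python
import json
import urllib.request


def create_index_request(es_url, index, shards=1, replicas=0):
    """Build a PUT request that creates an index with the given shard layout."""
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }
    return urllib.request.Request(
        es_url + "/" + index,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def cluster_status(es_url):
    """Return the cluster health status: green, yellow or red."""
    with urllib.request.urlopen(es_url + "/_cluster/health") as resp:
        return json.load(resp)["status"]


def create_index(es_url, index):
    """Send the create-index request and return the parsed response."""
    with urllib.request.urlopen(create_index_request(es_url, index)) as resp:
        return json.load(resp)
```

Calling create_index("http://localhost:9200", "test") before the restart and cluster_status("http://localhost:9200") after it should show the transition from green to red.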
I just tried it, and ES refuses write operations that land on a missing primary shard of a RED index, which makes sense.
Basically, this is like ES telling you: "Hey, instead of losing your documents because I cannot store them anywhere, I prefer not to index anything into that shard, and you'll have to retry later once you have solved the issue."
The result of trying to index a document into an index which is missing one or more shards is:
[2018-04-27T17:19:22,058][WARN ][r.suppressed ] path: /test/test/1, params: {pipeline=test, index=test, id=1, type=test}
org.elasticsearch.action.UnavailableShardsException: [test][1] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[test][1]] containing [index {[test][test][1], source[{"test":1}]}]]
In the log, you'll see that I've tried to use an ingest pipeline to see if it was possible to catch the issue and change the _index field to some other index, but that didn't work.
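Since the pipeline route did not work, the fallback would have to live in the indexing client, as suggested earlier. Here is a minimal sketch of that idea, not a tested recipe. It assumes the unavailable primary shard surfaces to the client as HTTP status 503, and `send` is a hypothetical callable, supplied by you, that performs the indexing request and returns the HTTP status code.

```python
def index_with_fallback(send, doc, index, fallback_index):
    """Index `doc` into `index`; on HTTP 503, retry once against `fallback_index`.

    `send(index_name, doc)` is a hypothetical callable that performs the
    indexing request and returns the HTTP status code.
    Returns a tuple (index_name_used, status).
    """
    status = send(index, doc)
    if status == 503:  # assumption: unavailable primary shard is reported as 503
        return fallback_index, send(fallback_index, doc)
    return index, status


# Usage with a stand-in for a real HTTP call:
def fake_send(index_name, doc):
    # Pretend the "test" index has lost its primary shard.
    return 503 if index_name == "test" else 201


used, status = index_with_fallback(fake_send, {"test": 1}, "test", "test-fallback")
print(used, status)  # test-fallback 201
```

Note that, as the log above shows, the failing request waits for its timeout (1m here) before the error comes back, so a real client would probably want to pass a shorter timeout on the indexing request.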