A few months ago I spent a long time developing an app, in Python with a Rust module, to do automatic indexing of all the text documents under a particular directory in my system: these text documents (Word, etc.) are sliced up into overlapping 10-paragraph segments, each one then being made into a "Lucene Document" for the purpose of indexing.
A few days ago my app stopped running. I found that the indices had "red" status. A PUT request to attempt to create a new index (I'm using an alias setup) failed with "bad response". Deleting all existing indices did not work. Restarting the ES service, or rebooting, did not help.
But I had previously come across this high watermark problem. I went to the drive in question and ruthlessly deleted 10s of GB of storage.
I try again: works!
Is there any particular way, from within ES, of diagnosing that the High Watermark has been reached for the disk containing the ES indices, or that a disk is approaching it? Presumably the percentage of storage must be configured somewhere, so I suppose it could be as simple as calculating how congested your disk is and comparing it with that configured percentage...
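The comparison described above could be sketched roughly as follows. This is only an illustration, not how Elasticsearch itself does it: the URL `http://localhost:9200` and all the function names are assumptions, and it presumes an unsecured local node. The thresholds of 85%, 90% and 95% are the documented defaults for the low, high and flood-stage watermarks; the configured values can be read from `GET _cluster/settings?include_defaults=true&flat_settings=true`.

```python
import json
import shutil
import sys
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, no security enabled
PREFIX = "cluster.routing.allocation.disk.watermark."


def parse_percent(value):
    """Turn '90%' into 90.0; return None for absolute values such as '50gb'."""
    value = value.strip()
    return float(value[:-1]) if value.endswith("%") else None


def fetch_watermarks(es_url=ES_URL):
    """Read the low/high/flood_stage watermark settings from the cluster."""
    url = es_url + "/_cluster/settings?include_defaults=true&flat_settings=true"
    with urllib.request.urlopen(url, timeout=10) as resp:
        settings = json.load(resp)
    merged = {}
    # Later sections override earlier ones: defaults < persistent < transient.
    for section in ("defaults", "persistent", "transient"):
        merged.update(settings.get(section, {}))
    return {name: merged.get(PREFIX + name) for name in ("low", "high", "flood_stage")}


def disk_used_percent(path):
    """Percentage of the disk holding `path` that is not available for use."""
    usage = shutil.disk_usage(path)
    return 100.0 * (usage.total - usage.free) / usage.total


def classify(used_percent, low=85.0, high=90.0, flood=95.0):
    """Name the highest watermark that `used_percent` has reached."""
    if used_percent >= flood:
        return "flood_stage"
    if used_percent >= high:
        return "high"
    if used_percent >= low:
        return "low"
    return "ok"


if __name__ == "__main__":
    # Local check only, against the default thresholds; pass the ES data
    # directory as the first argument (falls back to the current directory).
    data_path = sys.argv[1] if len(sys.argv) > 1 else "."
    used = disk_used_percent(data_path)
    print(f"{data_path}: {used:.1f}% used -> {classify(used)}")
```

The figure computed locally may differ slightly from what Elasticsearch reports for the same disk. `GET _cat/allocation?v` shows the `disk.percent` value per node as the cluster sees it, which is probably the more reliable number to watch.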
Thanks. I wondered whether there might indeed be other things which could go wrong and lead to mystifying (for someone at my level) failures. It'd be great to receive alerts about anything like that.
So it would be helpful to implement that. But looking at that page I don't actually understand how I would go about implementing it. I don't use Kibana, only Elasticsearch.
Could you maybe point to another page which gives some clues about how to apply these "alerting rules"?
Also it says these rules are "preconfigured". That doesn't necessarily mean "already implemented" of course. But maybe alerts are already being triggered somewhere?
I did say I was low low intermediate. I need a few more pointers about how all this is used in practice.
As a self-admitted noob I would get Kibana running because that is considered the management interface.
Yes, you can do almost all of this through the Elasticsearch REST API, but while you are still learning what the API calls look like, it's often easier to do it through the Kibana interface and look at the samples.
In particular, setting up rules (AKA alerts) and connectors through the Elasticsearch REST API is not trivial in any way.
Pre-Configured does not mean already running.
If you're running self-managed then no.
My recommendation is to roll out and configure Kibana, set up some of these rules, then use the REST interface to see what they look like. Then you can use that if you like.
Yes, but more generally if the cluster is in red or yellow health then this is the troubleshooting guide to follow. It leads you towards using the cluster allocation explain API which would immediately tell you about a disk space problem (as well as any other issues that might be blocking shard allocation).
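As a rough sketch of that workflow without Kibana, the snippet below checks `GET _cluster/health` and, if the status is not green, asks `GET _cluster/allocation/explain` why a shard is unassigned. The URL and the function names are assumptions for illustration, and it presumes an unsecured local node; with security enabled you would need HTTPS and credentials. Called without a request body, the explain API picks an arbitrary unassigned shard, and it returns an HTTP error if there is nothing unassigned to explain.

```python
import json
import urllib.error
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, no security enabled


def get_json(path, es_url=ES_URL):
    """GET a path from the cluster and decode the JSON response."""
    with urllib.request.urlopen(es_url + path, timeout=10) as resp:
        return json.load(resp)


def blocking_deciders(explain):
    """Collect (node, decider, explanation) for every decider that said NO."""
    found = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                found.append(
                    (node.get("node_name"), decider.get("decider"), decider.get("explanation"))
                )
    return found


def diagnose(es_url=ES_URL):
    """Print the cluster status and, if not green, what blocks allocation."""
    health = get_json("/_cluster/health", es_url)
    print("cluster status:", health["status"])
    if health["status"] == "green":
        return
    try:
        explain = get_json("/_cluster/allocation/explain", es_url)
    except urllib.error.HTTPError as err:
        # Typically means there is no unassigned shard to explain.
        print("allocation explain returned HTTP", err.code)
        return
    print("shard:", explain.get("index"), explain.get("shard"))
    for node, decider, why in blocking_deciders(explain):
        # A disk space problem shows up here as the "disk_threshold" decider.
        print(f"{node}: {decider}: {why}")
```

Calling `diagnose()` from a scheduled job would give a crude alert of sorts: any decider named `disk_threshold` in the output points at the watermark problem described earlier in the thread.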