How to create Alerts for cluster health (green/yellow/red) and Circuit Breaker errors?

In Kibana 8.9.0, I managed to successfully create an alert for Cluster Health,
so that if cluster health transitions from green to yellow or red,
I receive an alert.

I did the following:

  1. Go to Stack Monitoring
  2. In the top right, click on Enter setup mode
  3. In the top right, click on Alerts and rules
  4. Go to kbn:/app/management/insightsAndAlerting/triggersActions/rules
  5. Search for Cluster health

This works great.

How can I create alerts for Circuit Breaker errors,
which seem to throw es_rejected_execution_exception errors?

Even though my cluster health was green, I recently encountered two types of circuit breaker errors that I wish I had received alerts for:

  1. Circuit Breaker error 1:
failed to publish events: 429 Too Many Requests: {"error":{"root_cause":[{"type":"es_rejected_execution_exception",
"reason":"rejected execution of coordinating operation 
[coordinating_and_primary_bytes=216741495, replica_bytes=0, all_bytes=216741495, coordinating_operation_bytes=41925,
 max_coordinating_and_primary_bytes=214748364]"}],"type":"es_rejected_execution_exception","reason":"rejected execution of coordinating operation
[coordinating_and_primary_bytes=216741495, replica_bytes=0, all_bytes=216741495, coordinating_operation_bytes=41925,
max_coordinating_and_primary_bytes=214748364]"},"status":429}
  2. Circuit Breaker error 2:
failed to index document (es_rejected_execution_exception): rejected execution of
 TimedRunnable{original=org.elasticsearch.action.support.replication.TransportWriteAction$1/WrappedActionListener{org.elasticsearch.action.support.replication.ReplicationOperation$$Lambda$9217/0x00007f1ce96ca250@2fe4a9c4}
{org.elasticsearch.action.support.replication.ReplicationOperation$$Lambda$9218/0x00007f1ce96ca468@6d6c9beb},
 creationTimeNanos=775669912784792, startTimeNanos=0, finishTimeNanos=-1, failedOrRejected=false} on
 TaskExecutionTimeTrackingEsThreadPoolExecutor[name = quicknode-elastic-es-data-hot-zone-3-1/write,
queue capacity = 10000, task execution EWMA = 3.4ms, total task execution time = 58.2d,
 org.elasticsearch.common.util.concurrent.TaskExecutionTimeTrackingEsThreadPoolExecutor@2e4d252[Running, pool size = 8, active threads = 8, queued tasks = 10001, completed tasks = 639326371]]

Is there a pre-canned Alert that I can use to detect those types of errors?

If not, is there a general technique for creating alerts for Circuit Breaker errors?

As of 8.9.1 these are the rules available:

If those log messages are ingested into your cluster properly, then you need to create a custom rule based on an Elasticsearch query, as documented here
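
As a rough sketch of what the query for such a rule might look like: the snippet below builds a Query DSL body that matches log lines containing es_rejected_execution_exception. The helper name build_rejection_query and the field name "message" are assumptions for illustration; use whichever field your shipper writes the log line to. The time field, time window, and threshold are set on the rule itself, so they are left out here.

```python
import json


def build_rejection_query(message_field="message"):
    """Build a Query DSL body matching rejected-execution log lines.

    message_field is an assumed field name; adjust it to your own mapping.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    # Match the exception type that appears in both error messages.
                    {"match_phrase": {message_field: "es_rejected_execution_exception"}}
                ]
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON so it can be pasted into the rule's query editor.
    print(json.dumps(build_rejection_query(), indent=2))
```

With a threshold such as "matches more than 0 documents", the rule should fire whenever at least one rejection is logged within the rule's time window.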
