Watcher Version Error

alerting

(piyush) #1

I am trying to test sending an email from Watcher (based on cluster status), but I am getting this error:

[.watches][[.watches][0]] VersionConflictEngineException[[watch][cluster_health_watch]: version conflict, current [358], provided [357]]
at org.elasticsearch.index.engine.Engine.getFromSearcher(Engine.java:263)
at org.elasticsearch.index.engine.InternalEngine.get(InternalEngine.java:349)
at org.elasticsearch.index.shard.IndexShard.get(IndexShard.java:615)
at org.elasticsearch.index.get.ShardGetService.innerGet(ShardGetService.java:173)
at org.elasticsearch.index.get.ShardGetService.get(ShardGetService.java:86)
at org.elasticsearch.action.update.UpdateHelper.prepare(UpdateHelper.java:76)
at org.elasticsearch.action.update.TransportUpdateAction.shardOperation(TransportUpdateAction.java:170)
at org.elasticsearch.action.update.TransportUpdateAction.shardOperation(TransportUpdateAction.java:164)
at org.elasticsearch.action.update.TransportUpdateAction.shardOperation(TransportUpdateAction.java:65)
at org.elasticsearch.action.support.single.instance.TransportInstanceSingleOperationAction$ShardTransportHandler.messageReceived(TransportInstanceSingleOperationAction.java:249)
at org.elasticsearch.action.support.single.instance.TransportInstanceSingleOperationAction$ShardTransportHandler.messageReceived(TransportInstanceSingleOperationAction.java:245)
at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:350)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

Email:
*Cluster status is yellow

curl -XPUT 'http://localhost:9200/_watcher/watch/cluster_health_watch' -d '{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "localhost",
        "port" : 9200,
        "path" : "/_cluster/health"
      }
    }
  },
  "condition" : {
    "compare" : {
      "ctx.payload.status" : { "eq" : "yellow" }
    }
  },
  "actions" : {
    "send_email" : {
      "email" : {
        "to" : "abc@test.com",
        "subject" : "Cluster Status Warning",
        "body" : "Cluster status is Yellow..."
      }
    }
  }
}'


(Alexander Reelsen) #2

Hey,

What happens here is that each watch stores itself after it has run, and that write uses versioning. Specifically, the watch was loaded together with its version information, executed, and then written back with that same version specified. The write failed because another process had already done exactly the same thing, so the stored version had moved on (current [358], provided [357] in your error). So somehow the same watch was triggered twice in your cluster.
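
To make the load/execute/store cycle concrete, here is a minimal Python sketch of a version-checked write. It is not Watcher's actual code: `VersionedStore`, `VersionConflict`, and `run_demo` are hypothetical names, and the store is just an in-memory dict. It only shows why the second of two overlapping executions fails when both loaded the same version.

```python
class VersionConflict(Exception):
    """Raised when the provided version no longer matches the stored one."""


class VersionedStore:
    """Hypothetical in-memory stand-in for a versioned document store."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, body)

    def put(self, doc_id, body):
        # Unconditional write: bumps the version by one (starts at 1).
        version = self._docs.get(doc_id, (0, None))[0] + 1
        self._docs[doc_id] = (version, body)
        return version

    def get(self, doc_id):
        # Returns (version, body) so the caller can write back later.
        return self._docs[doc_id]

    def update(self, doc_id, body, provided_version):
        # Conditional write: only succeeds if nobody wrote in between.
        current, _ = self._docs[doc_id]
        if current != provided_version:
            raise VersionConflict(
                f"version conflict, current [{current}], "
                f"provided [{provided_version}]"
            )
        self._docs[doc_id] = (current + 1, body)
        return current + 1


def run_demo():
    store = VersionedStore()
    store.put("cluster_health_watch", {"runs": 0})

    # Two executions of the same watch load it and see the same version.
    version_a, _ = store.get("cluster_health_watch")
    version_b, _ = store.get("cluster_health_watch")

    # The first execution stores its result; the version is bumped.
    new_version = store.update("cluster_health_watch", {"runs": 1}, version_a)

    # The second execution still holds the old version, so its write fails.
    try:
        store.update("cluster_health_watch", {"runs": 1}, version_b)
    except VersionConflict as exc:
        return new_version, str(exc)
    return new_version, None


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the failing write is simply rejected. Whether anything is lost in the real system depends on what the two executions did before storing, which the sketch does not model.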

Is it possible that you triggered this watch manually using the execute API while it was running, or that you had an unstable cluster and switched master nodes shortly before this happened?

--Alex

