Watcher not starting on hosted cluster


(Jakehschwartz) #1

Hello all,

I have a cluster hosted on elastic cloud and I can index documents to the cluster, but when I try to create or delete watches, I receive the following error

{
   "error": "ElasticsearchIllegalStateException[not started]",
   "status": 500
}

Because this is a hosted cluster, I don't have logs or any other useful information. /_cat/indices shows the Watcher indices and /_cat/plugins shows the Watcher plugin is installed. What else can I do?
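For reference, the checks above as console requests (a sketch; the exact index pattern may differ on your cluster):

GET _cat/indices/.watch*
GET _cat/plugins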


(Alexander Reelsen) #2

Hey,

Out of curiosity, I assume this is a cluster hosted by Elastic Cloud?

Can you manually start watcher using the start API and paste what is being returned?
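For a Watcher 1.x cluster, the manual start call looks roughly like this (a sketch; check the docs for your exact version):

PUT _watcher/_start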

--Alex


(Jakehschwartz) #3

Yes, sorry. Hosted by Elastic Cloud.

Both _watcher/_start and _watcher/_restart return

{
   "acknowledged": true
}

But my attempts to post a watch still return

{
    "error": "RemoteTransportException[[tiebreaker-0000000023][inet[/REDACTED-IP]]  [cluster:admin/watcher/watch/put]]; nested: ElasticsearchIllegalStateException[not started]; ",
    "status": 500
}

(Alexander Reelsen) #4

Hey,

interesting. Let's try and debug this further.

  1. You do have access to the logs in Cloud via the Logs tab. Can you search for "watcher", or maybe just paste all the entries that occur when you try to start it?
  2. Is it possible that you have lost some shards of the Watcher-related indices? Can you run
GET _cat/shards/.w*
GET _cat/shards/.t*

and show the results?

--Alex


(Jakehschwartz) #5

I can't believe I missed the logs tab :sob:

[2017-01-05 16:45:49,351][WARN ][watcher ] [tiebreaker-0000000023] failed to start watcher. please wait for the cluster to become ready or try to start Watcher manually
org.elasticsearch.index.engine.DocumentAlreadyExistsException: [.watch_history-2017.01.04][0] [watch_record][danger-room-ab9c3479-5926-4686-8375-64d8b5075780_12-2017-01-04T00:00:00.096Z]: document already exists
    at org.elasticsearch.index.engine.InternalEngine.innerCreateNoLock(InternalEngine.java:329)
    at org.elasticsearch.index.engine.InternalEngine.innerCreate(InternalEngine.java:287)
    at org.elasticsearch.index.engine.InternalEngine.create(InternalEngine.java:259)
    at org.elasticsearch.index.shard.IndexShard.create(IndexShard.java:482)
    at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:206)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I deleted the .watch_history* indices and attempted to restart it again, but it looks like it hung, with nothing in the logs other than "[INFO ][watcher ] starting watch service..."
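For reference, the cleanup amounts to something like this (a sketch; double-check what the wildcard matches before deleting on a real cluster):

DELETE .watch_history-*
PUT _watcher/_restart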


(Jakehschwartz) #6

Even after a full cluster restart, nothing is happening in the logs. All of my shards are in a STARTED state as well.


(Jakehschwartz) #7

Looks like it just took a long time to start. Thanks for your help!

17:22:42	INFO	watcher	[2017-01-05 17:22:42,604][INFO ][watcher ] watch service has started
17:06:37	INFO	watcher	[2017-01-05 17:06:37,444][INFO ][watcher ] starting watch service...

(Alexander Reelsen) #8

Hey,

wow, that's a lot of time for starting up! I guess it is too late now, but did any of those indices (especially the .triggered_watches one) contain a lot of documents?

--Alex


(Jakehschwartz) #9

I removed all the watches beforehand, so unless .triggered_watches was full of deleted documents or something like that, I'm not sure what caused it to be slow.
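For next time, a quick way to check the document count of that index (a sketch; the index name assumes Watcher 1.x defaults):

GET _cat/count/.triggered_watches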


(system) #10

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.