Failed to start watcher. Please wait for the cluster to become ready or try to start Watcher manually

Hi,
I have installed the Watcher plugin, but when I try to run Elasticsearch I get an exception.

Please help me resolve this. It's urgent.

Thanks,
Sahil

Hey,

Please paste logs as text rather than images, and check whether there are other exceptions on startup. What happens when you actually start Watcher manually, as the error message suggests?
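For reference, a minimal sketch of what starting Watcher manually could look like from a script. It assumes the Watcher 2.x service endpoint `PUT _watcher/_start` and a node on `localhost:9200` (as in the logs in this thread); the helper names `watcher_request` and `start_watcher` are made up for illustration.

```python
import json
import urllib.request

# Assumption: the node listens on localhost:9200, as in this thread's logs.
BASE_URL = "http://localhost:9200"


def watcher_request(method, path, base_url=BASE_URL):
    """Build (but do not send) an HTTP request for a Watcher endpoint."""
    return urllib.request.Request(base_url.rstrip("/") + path, method=method)


def start_watcher(base_url=BASE_URL, timeout=30):
    """Send PUT _watcher/_start and return the decoded JSON response."""
    req = watcher_request("PUT", "/_watcher/_start", base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (needs a running node, so it is left commented out):
# print(start_watcher())
```

If the call times out in the same way as the automatic start, that points at the cluster state rather than at Watcher itself.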

--Alex

D:\elasticsearch\elasticsearch-2.4.0\elasticsearch-2.4.0\bin>elasticsearch
[2016-09-28 13:45:21,569][INFO ][node ] [Hideko Takata] version[2.4.0], pid[3532], build[ce9f0c7/2016-08-29T09:14:17Z]
[2016-09-28 13:45:21,570][INFO ][node ] [Hideko Takata] initializing ...
[2016-09-28 13:45:22,051][INFO ][plugins ] [Hideko Takata] modules [lang-groovy, reindex, lang-expression], plugins [watcher, license], sites []
[2016-09-28 13:45:22,085][INFO ][env ] [Hideko Takata] using [1] data paths, mounts [[(D:)]], net usable_space [290.2gb], net total_space [319.2gb], spins? [unknown], types [NTFS]
[2016-09-28 13:45:22,087][INFO ][env ] [Hideko Takata] heap size [910.5mb], compressed ordinary object pointers [true]
[2016-09-28 13:45:22,115][INFO ][watcher.trigger.schedule ] [Hideko Takata] using [ticker] schedule trigger engine
[2016-09-28 13:45:25,128][INFO ][node ] [Hideko Takata] initialized
[2016-09-28 13:45:25,128][INFO ][node ] [Hideko Takata] starting ...
[2016-09-28 13:45:25,481][INFO ][transport ] [Hideko Takata] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2016-09-28 13:45:25,484][INFO ][discovery ] [Hideko Takata] elasticsearch/wSWMrJhFR7qVKF4UzrtjaQ
[2016-09-28 13:45:29,520][INFO ][cluster.service ] [Hideko Takata] new_master {Hideko Takata}{wSWMrJhFR7qVKF4UzrtjaQ}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-09-28 13:45:29,878][INFO ][http ] [Hideko Takata] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2016-09-28 13:45:30,528][INFO ][license.plugin.core ] [Hideko Takata] license [536f535b-d32d-4d2c-aa05-e66e4c8f8adc] - valid
[2016-09-28 13:45:33,059][INFO ][node ] [Hideko Takata] started
[2016-09-28 13:45:33,061][ERROR][license.plugin.core ] [Hideko Takata]

License will expire on [Wednesday, October 26, 2016]. If you have a new license, please update it.

Otherwise, please reach out to your support contact.

Commercial plugins operate with reduced functionality on license expiration:

- watcher

- PUT / GET watch APIs are disabled, DELETE watch API continues to work

- Watches execute and write to the history

- The actions of the watches don't execute

[2016-09-28 13:45:40,909][INFO ][gateway ] [Hideko Takata] recovered [174] indices into cluster_state
[2016-09-28 13:45:44,601][INFO ][watcher ] [Hideko Takata] starting watch service...
[2016-09-28 13:46:44,652][WARN ][watcher ] [Hideko Takata] failed to start watcher. please wait for the cluster to become ready or try to start Watcher manually
ElasticsearchTimeoutException[Timeout waiting for task.]
at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:70)
at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:62)
at org.elasticsearch.watcher.support.init.proxy.ClientProxy.index(ClientProxy.java:86)
at org.elasticsearch.watcher.history.HistoryStore.forcePut(HistoryStore.java:115)
at org.elasticsearch.watcher.execution.ExecutionService.executeTriggeredWatches(ExecutionService.java:404)
at org.elasticsearch.watcher.execution.ExecutionService.start(ExecutionService.java:99)
at org.elasticsearch.watcher.WatcherService.start(WatcherService.java:82)
at org.elasticsearch.watcher.WatcherLifeCycleService.start(WatcherLifeCycleService.java:100)
at org.elasticsearch.watcher.WatcherLifeCycleService.access$100(WatcherLifeCycleService.java:36)
at org.elasticsearch.watcher.WatcherLifeCycleService$3.run(WatcherLifeCycleService.java:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2016-09-28 13:46:44,705][INFO ][watcher ] [Hideko Takata] starting watch service...
[2016-09-28 13:47:44,730][WARN ][watcher ] [Hideko Takata] failed to start watcher. please wait for the cluster to become ready or try to start Watcher manually

Hey,

My current assumption is that the watch history index has not recovered. You can use the recovery API to check that, and the Watcher stats API to find out whether Watcher has started.

Also, did you start watcher manually as suggested in the logs and in my last post?
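A rough sketch of both checks, assuming the `_cat/recovery?v` and `_watcher/stats` endpoints of Elasticsearch/Watcher 2.x and a node on `localhost:9200`. The helper names are hypothetical, and the `.watch` prefix filter assumes the default Watcher index names (such as `.watches` and `.watch_history-*`).

```python
import json
import urllib.request

# Assumption: default local node, as in this thread's logs.
BASE_URL = "http://localhost:9200"


def http_get(path, base_url=BASE_URL, timeout=30):
    """Fetch a path from the node and return the body as text."""
    url = base_url.rstrip("/") + path
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def watcher_rows(cat_recovery_text):
    """Keep the header line plus rows whose index name starts with '.watch'."""
    lines = cat_recovery_text.splitlines()
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    kept = [r for r in rows if r.split() and r.split()[0].startswith(".watch")]
    return [header] + kept


def watcher_state(stats_json):
    """Read the 'watcher_state' field from a _watcher/stats response body."""
    return json.loads(stats_json).get("watcher_state")


# Usage (needs a running node, so it is left commented out):
# print("\n".join(watcher_rows(http_get("/_cat/recovery?v"))))
# print(watcher_state(http_get("/_watcher/stats")))
```

A Watcher index whose recovery stage is not yet `done`, or a state other than `started`, would fit the timeout seen in the logs.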

--Alex

Hi Alex,

I deleted the watch indices and restarted Elasticsearch, and it worked.
Now I am trying to send an alert using Gmail, but I am getting exceptions for that as well (failed to update the watch record, and connection refused). See below:

    ... 13 more

[2016-09-28 14:52:02,819][ERROR][watcher.execution ] [Vavavoom] failed to update watch record [event_critical_watch_25-2016-09-28T09:20:32.141Z]
ElasticsearchTimeoutException[Timeout waiting for task.]
at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:70)
at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:62)
at org.elasticsearch.watcher.support.init.proxy.ClientProxy.index(ClientProxy.java:86)
at org.elasticsearch.watcher.history.HistoryStore.put(HistoryStore.java:93)
at org.elasticsearch.watcher.execution.ExecutionService.execute(ExecutionService.java:302)
at org.elasticsearch.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:438)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[2016-09-28 14:52:03,919][ERROR][watcher.actions.email ] [Vavavoom] failed to execute action [event_critical_watch/email_admin]
javax.mail.MessagingException: failed to send email with subject [event_critical_watch executed] via account [gmail];
nested exception is:
com.sun.mail.util.MailConnectException: Couldn't connect to host, port: smtp.gmail.com, 587; timeout 120000;
nested exception is:
java.net.ConnectException: Connection refused: connect
at org.elasticsearch.watcher.actions.email.service.InternalEmailService.send(InternalEmailService.java:85)
at org.elasticsearch.watcher.actions.email.service.InternalEmailService.send(InternalEmailService.java:77)
at org.elasticsearch.watcher.actions.email.ExecutableEmailAction.execute(ExecutableEmailAction.java:84)
at org.elasticsearch.watcher.actions.ActionWrapper.execute(ActionWrapper.java:106)
at org.elasticsearch.watcher.execution.ExecutionService.executeInner(ExecutionService.java:388)
at org.elasticsearch.watcher.execution.ExecutionService.execute(ExecutionService.java:273)
at org.elasticsearch.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:438)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.sun.mail.util.MailConnectException: Couldn't connect to host, port: smtp.gmail.com, 587; timeout 120000;
nested exception is:
java.net.ConnectException: Connection refused: connect
at com.sun.mail.smtp.SMTPTransport.openServer(SMTPTransport.java:2054)
at com.sun.mail.smtp.SMTPTransport.protocolConnect(SMTPTransport.java:697)
at javax.mail.Service.connect(Service.java:364)
at org.elasticsearch.watcher.actions.email.service.Account.send(Account.java:124)
at org.elasticsearch.watcher.actions.email.service.InternalEmailService.send(InternalEmailService.java:83)
... 9 more
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)

This is my SMTP configuration in elasticsearch.yml

watcher.actions.email.service.account:
    gmail:
        profile: gmail
        smtp:
            auth: true
            starttls.enable: true
            host: smtp.gmail.com
            port: 587
            user: *********@gmail.com
            password: ************

Hey,

It does not look to me as if your cluster has fully recovered and returned to a good state, as it still cannot write to the watch history index. Did you run the cat API I mentioned in a previous blog post?

On top of that, it seems you cannot reach gmail from your network.
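A quick way to test this outside of Elasticsearch is to open a plain TCP connection from the same machine. The sketch below only uses the Python standard library; `can_connect` is a hypothetical helper name, and the host and port are the ones from the stack trace above.

```python
import socket


def can_connect(host, port, timeout=10.0):
    """Return (True, None) if a TCP connection opens, else (False, error text)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers refused connections, timeouts and DNS failures alike.
        return False, str(exc)


# Usage (needs outbound network access, so it is left commented out):
# print(can_connect("smtp.gmail.com", 587))
```

If this also reports a refused connection, the likely cause is a firewall or proxy blocking outbound traffic on port 587, not the Watcher email configuration.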

--Alex

Running the cat API means this (http://localhost:9200/_cat/indices?v), right? After running it, the health status of every index shows as yellow.

A small snapshot of that is here.

Is this correct, or shall I delete all the indices and start fresh?

Also, please share the link to the previous blog post where you mentioned all this. I couldn't find it.

Please let me know.

Thanks.

I was referring to this very thread, where I mentioned which cat API output would help here - so please see my above replies.

Also, please refrain from using screenshots. Paste the output as text, and paste the full output rather than just parts of it. Otherwise it is so much harder to help.
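One possible way to capture the complete text output instead of a screenshot, sketched under the assumption that the node is on `localhost:9200` and that `_cat/indices?v` prints a header row containing a `health` column. The function names are hypothetical. Rows with missing columns (closed indices, for example) are skipped by the summary.

```python
import urllib.request
from collections import Counter


def fetch_cat_indices(base_url="http://localhost:9200", timeout=30):
    """Return the plain-text body of GET _cat/indices?v."""
    url = base_url.rstrip("/") + "/_cat/indices?v"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def health_counts(cat_text):
    """Count indices per health value, located via the 'health' header column."""
    lines = [ln for ln in cat_text.splitlines() if ln.strip()]
    if not lines:
        return Counter()
    header = lines[0].split()
    col = header.index("health")
    counts = Counter()
    for line in lines[1:]:
        fields = line.split()
        if len(fields) == len(header):  # skip rows with missing columns
            counts[fields[col]] += 1
    return counts


# Usage (needs a running node, so it is left commented out):
# text = fetch_cat_indices()
# print(text)                 # full output, ready to paste into the thread
# print(health_counts(text))  # short summary per health value
```

Note that yellow means all primary shards are allocated but at least one replica is not, which is the expected state on a single-node cluster whose indices are configured with replicas.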

--Alex