Specify timeout per watch

I have a watch that runs infrequently and is scheduled during low cluster activity. Even then, the watch query sometimes needs more than the default timeout of 30s to execute. I don't want to change the default, since the other watches that run more frequently shouldn't need more than that. Is it possible to specify, within the watch definition, a query timeout that is longer than the default? I'm using 2.1.2 but am looking to upgrade to 5.4.x.
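
Conceptually, something like the following is what I am after (just a sketch on my part; I am assuming here that the search input can take its own timeout field, which is exactly what I would like to confirm):

"input": {
  "search": {
    "timeout": "2m",
    "request": {
      "indices": [ "performance_search" ],
      "body": { ... }
    }
  }
}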

Thanks

Can you please be more specific about which timeout you are talking about here?

"execution_duration" : 30035,
"input" : {
"type" : "search",
"status" : "failure",
"reason" : "ElasticsearchTimeoutException[Timeout waiting for task.]",


Hey,

Can you provide the full watch? Also, is this a cluster that is busy with other searches? Is the cluster health at least yellow when the watch executes? Can you also provide the full watch history entry?

It would be best to upload them to a gist and link it from here, since you cannot upload very large documents here and they would otherwise need special formatting.
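
You can check the health with a plain GET _cluster/health while the watch runs. For the history entry, something along these lines should find the latest record (a sketch; I am assuming the default .watcher-history-* index pattern and using a placeholder watch id):

GET .watcher-history-*/_search
{
  "size": 1,
  "sort": [ { "trigger_event.triggered_time": "desc" } ],
  "query": {
    "term": { "watch_id": "my-usage-report-watch" }
  }
}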

Thanks!

--Alex

The cluster itself is all green, but very busy. I was able to reduce the timeframe of the watch from 30 days to 24 hours, and the watch then ran successfully.
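
Concretely, that was just a change to the metadata value that the range filter picks up, roughly:

"metadata": {
  "timespan": "now-24h"
}

where the previous value covered the full 30 days (along the lines of "now-30d").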

Here is a portion of the output email; you can see the large number of log entries over just 24 hours:
Cloud service usage Report PROD Environment

EPRS - Total Requests 33213529

| Calling Service | Total Requests | Max (ms) | Avg (ms) | Total Elapsed Time (ms) |
|---|---|---|---|---|
| ABE | 23741667 | 370178.0 | 169.80736584335042 | 4.031509934E9 |
| NLP | 9240170 | 127267.0 | 227.3885755348657 | 2.101109094E9 |
| | 170240 | 32802.0 | 20.4757929981203 | 3485799.0 |
| CopyPasteEngine | 61452 | 55797.0 | 1434.2489259910174 | 8.8137465E7 |

EPRSService - Total Requests 31645548

| Calling Service | Total Requests | Max (ms) | Avg (ms) | Total Elapsed Time (ms) |
|---|---|---|---|---|
| ABE | 22898992 | 32917.0 | 20.50080047191597 | 4.69447666E8 |
| NLP | 8708422 | 37128.0 | 12.891938057204852 | 1.12268437E8 |
| CopyPasteEngine | 38134 | 10991.0 | 7.109141448576074 | 271100.0 |

HIPAAService - Total Requests 24309818

| Calling Service | Total Requests | Max (ms) | Avg (ms) | Total Elapsed Time (ms) |
|---|---|---|---|---|
| ABE | 24268895 | 34826.0 | 76.750084006709 | 1.86263973E9 |
| DataUpload | 17685 | 64023.0 | 73.22476675148431 | 1294980.0 |
| Talend_KWB | 6712 | 3105.0 | 182.83700834326578 | 1227202.0 |
| OperationalAnalyticsHub | 5901 | 3064.0 | 166.75936281986105 | 984047.0 |
| NLP | 3167 | 1996.0 | 142.93684875276287 | 452681.0 |
| GPCSHipaaLogWriter | 2516 | 3001.0 | 96.21661367249602 | 242081.0 |
| | 1721 | 3302.0 | 110.17606042998257 | 189613.0 |
| SearchService | 1272 | 1513.0 | 148.68867924528303 | 189132.0 |
| CoreSTS | 1225 | 816.0 | 97.43510204081633 | 119358.0 |
| EPRS | 715 | 2130.0 | 178.83636363636364 | 127868.0 |
| AdminConsole | 4 | 109.0 | 79.0 | 316.0 |
| NormsFactory | 3 | 204.0 | 200.0 | 600.0 |
| ApplicationManagementServices | 2 | 161.0 | 124.5 | 249.0 |

And here is the watch itself.
{
  "trigger": {
    "schedule": {
      "weekly": {
        "on": "monday",
        "at": "07:00"
      }
    }
  },
  "metadata": {
    "timespan": "now-24h",
    "email_subject": "Cloud service usage report TEST Environment",
    "email_body_header": "Cloud service usage Report TEST Environment"
  },
  "input": {
    "search": {
      "request": {
        "indices": [
          "performance_search"
        ],
        "body": {
          "size": 0,
          "fields": [
            "LoggingProcess.Name",
            "InvokingProcess.Name",
            "RootProcess.Name",
            "ElapsedTime"
          ],
          "query": {
            "filtered": {
              "filter": {
                "terms": {
                  "Tags": [
                    "msm",
                    "tomcatmsm",
                    "cachehit"
                  ]
                }
              },
              "query": {
                "bool": {
                  "must": [
                    {
                      "range": {
                        "Timestamp": {
                          "gte": "{{ctx.metadata.timespan}}"
                        }
                      }
                    }
                  ]
                }
              }
            }
          },
          "aggs": {
            "distinct_client": {
              "terms": {
                "size": 100,
                "missing": "",
                "field": "LoggingProcess.Name"
              },
              "aggs": {
                "distinct_service": {
                  "terms": {
                    "size": 100,
                    "missing": "",
                    "field": "InvokingProcess.Name"
                  },
                  "aggs": {
                    "elapsed_time_stats": {
                      "stats": {
                        "field": "ElapsedTime"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "always": {}
  },
  "actions": {
    "email_report": {
      "email": {
        "to": [
          "memyemail@email.com"
        ],
        "subject": "{{ctx.metadata.email_subject}}",
        "body": {
          "html": "table, th, td {border: 1px solid black; border-collapse: collapse;} th, td {padding: 5px;} {{ctx.metadata.email_body_header}} {{#ctx.payload.aggregations.distinct_client.buckets}} {{key}} - Total Requests {{doc_count}} <table border=\"1\" style=\"width:100%\"> Calling Service Total Requests Max (ms) Avg (ms) Total Elapsed Time {{#distinct_service.buckets}} {{key}} {{doc_count}} {{elapsed_time_stats.max}} {{elapsed_time_stats.avg}} {{elapsed_time_stats.sum}} {{/distinct_service.buckets}} {{/ctx.payload.aggregations.distinct_client.buckets}} Executed watch [{{ctx.watch_id}}] at [{{ctx.execution_time}}] Do not reply to this email. The sending account is not monitored."
        }
      }
    }
  }
}

hey,

have you tried the query in standalone mode, without the watch? Is it possible that you are querying too many shards and your cluster does not have enough resources to sustain the query?
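
For example, you could run the same request directly against the index (a sketch that reuses the index and the 2.x filtered query from your watch, with the mustache variable replaced by a fixed value; the aggregations are the same ones as in the watch):

GET performance_search/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "terms": { "Tags": [ "msm", "tomcatmsm", "cachehit" ] }
      },
      "query": {
        "bool": {
          "must": [
            { "range": { "Timestamp": { "gte": "now-24h" } } }
          ]
        }
      }
    }
  },
  "aggs": { ... }
}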

--Alex
