Hi everyone.
I've run into the following problem: I set up a couple of tasks in Curator, and one of the indices started getting deleted while forcemerge was running. I can't figure out why this happens.
The goal was to first run forcemerge on indices older than 5 days, and then allocate them to the cold node. In Curator I created these tasks:
---
actions:
  1:
    action: forcemerge
    description: "forceMerge all logs"
    options:
      max_num_segments: 1
      delay: 1
      timeout_override:
      continue_if_exception: True
      disable_action: False
    filters:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 5
      field:
      stats_result:
      epoch:
      exclude: False
  2:
    action: allocation
    description: "Apply shard allocation filtering rules by age"
    options:
      key: box_type
      value: cold
      wait_for_completion: True
      continue_if_exception: False
      disable_action: false
    filters:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 10
      field:
      stats_result:
      epoch:
      exclude: False
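To make it easier to reason about which indices each age filter above is expected to pick up, here is a rough Python sketch of a name-based "older than N days" check. It is only an approximation written for illustration, not Curator's actual implementation: the function name `indices_older_than`, the hard-coded date pattern for '%Y.%m.%d', and every index name except prod-kub-2018.07.22 are assumptions.

```python
import re
from datetime import datetime, timedelta

# Matches a date written as %Y.%m.%d inside an index name (assumed format).
DATE_PATTERN = re.compile(r"\d{4}\.\d{2}\.\d{2}")


def indices_older_than(names, days, now):
    """Return the names whose embedded date is more than `days` days before `now`.

    Illustrative approximation of an age filter with source: name and
    direction: older. Names without a date are skipped.
    """
    cutoff = now - timedelta(days=days)
    selected = []
    for name in names:
        match = DATE_PATTERN.search(name)
        if match is None:
            continue  # no date in the name, so the filter cannot apply
        stamp = datetime.strptime(match.group(0), "%Y.%m.%d")
        if stamp < cutoff:
            selected.append(name)
    return selected


# Hypothetical index list; only prod-kub-2018.07.22 appears in the real logs.
names = [
    "prod-kub-2018.07.22",
    "prod-kub-2018.07.25",
    "prod-kub-2018.07.10",
    "no-date-index",
]
run_time = datetime(2018, 7, 28, 1, 0, 2)

forcemerge_candidates = indices_older_than(names, 5, run_time)
allocation_candidates = indices_older_than(names, 10, run_time)
print(forcemerge_candidates)
print(allocation_candidates)
```

Under these assumptions the two actions select different sets of indices, because task 1 uses unit_count: 5 and task 2 uses unit_count: 10. Running Curator itself with the --dry-run flag should show what it would really select without changing anything.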
In the logs for task 1:
2018-07-28 01:00:02,979 INFO Preparing Action ID: 1, "forcemerge"
2018-07-28 01:00:02,998 INFO Trying Action ID: 1, "forcemerge": forceMerge all logs
2018-07-28 01:04:31,349 INFO forceMerging index prod-kub-2018.07.22 to 1 segments per shard. Please wait...
2018-07-28 01:04:31,349 INFO forceMerging index prod-kub-2018.07.22 to 1 segments per shard. Please wait...
....
2018-07-28 01:08:12,350 INFO Pausing for 1.0 seconds before continuing...
2018-07-28 01:08:13,352 INFO Action ID: 1, "forcemerge" completed.
For task 2:
2018-07-28 01:08:13,353 INFO Preparing Action ID: 2, "allocation"
2018-07-28 01:08:13,353 INFO Preparing Action ID: 2, "allocation"
2018-07-28 01:08:13,360 INFO Trying Action ID: 2, "allocation": Apply shard allocation filtering rules by age
2018-07-28 01:08:13,400 INFO Trying Action ID: 2, "allocation": Apply shard allocation filtering rules by age
2018-07-28 01:08:14,744 INFO Updating index setting {'index.routing.allocation.require.box_type': 'cold'}
2018-07-28 01:08:14,750 INFO Updating index setting {'index.routing.allocation.require.box_type': 'cold'}
2018-07-28 01:08:44,777 ERROR Failed to complete action: allocation. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host=u'SERVER1', port=9200): Read timed out. (read timeout=30))
2018-07-28 01:08:44,782 ERROR Failed to complete action: allocation. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host=u'SERVER2', port=9200): Read timed out. (read timeout=30))
As a result, an index from 5 days back gets deleted, but only that one index; all the others stay in place. And it isn't clear which of the two tasks causes this.