Hi,
I'm configuring Curator to take a snapshot, but it keeps failing: the snapshot ends in the PARTIAL state.
Our cluster has 5 master nodes and 18 data nodes, all sharing an NFS Data Domain mount.
The Curator log is below:
2020-10-12 22:20:20,606 DEBUG curator.cli process_action:101 Doing the action here.
2020-10-12 22:20:20,757 DEBUG curator.utils test_repo_fs:1317 All nodes can write to the repository
2020-10-12 22:20:20,758 DEBUG curator.utils test_repo_fs:1319 Nodes with verified repository access: {'T_Jr9vNiQcqngHYd3do0vg': {'name': ''}, '-KU_84pmSTicg6XwdBAOqA': {'name': ''}, 'BE0S1L7TRtCauzOcvBu0Gw': {'name': ''}, 'AxoLsUvZSh6Y18-KUXRJ0g': {'name': ''}, 'TIshnuDqS--6NwasR7-4qA': {'name': ''}, 'ce66QOSwRGOkFxOpCj4PPQ': {'name': ''}, 'UKjoAwjvRPeP6azOitn8Lg': {'name': ''}, 'OWflKxVhRcenQPmy-L2OYQ': {'name': ''}, 'ykpLrHxVS2GCNA7ZJW7wXQ': {'name': ''}, 'Vib2qLNgQL-1u-ZoB43b7g': {'name': ''}, 'DpuvnJ97S1KoFpo7Zn0o7A': {'name': ''}, 'KpDKIr0GQy6UuoBhtqOT9Q': {'name': ''}, 'MixlOSNiTrCv6b67SCiPeg': {'name': ''}, 'kb-mJzzFQi-p5our_29faA': {'name': ''}, 'McvCIKZESxK9WDnNgIqxWQ': {'name': ''}, 'DRnN-zXFS6CmHvW7UheEeg': {'name': ''}, 'mXKhcY3dQXK_NgqIt14VUw': {'name': '}, 'dU0KZoUaRtKKoe_pvnNhBg': {'name': ''}, '-H8dgbX-RveWCTKzXZsU5A': {'name': ''}, '5UXp5RGGTP-JSuwXvKHhxQ': {'name': ''}, '38IXdNjvSK6l9IG3WCh4iw': {'name': ''}, 'XCvgK56wRyymhr4cGHzgOQ': {'name': ''}, 'LD_wMnjzSg-lS-0QFG6BpA': {'name': ''}}
2020-10-12 22:20:20,763 INFO curator.actions.snapshot do_action:1701 Creating snapshot "curator-20201012222020" from indices: ['production-a3pi_assets-2016']
2020-10-12 22:20:20,810 DEBUG curator.utils wait_for_it:1808 Elapsed time: 0 seconds
2020-10-12 22:20:20,814 DEBUG curator.utils snapshot_check:1589 Snapshot state = IN_PROGRESS
2020-10-12 22:20:20,814 INFO curator.utils snapshot_check:1591 Snapshot curator-20201012222020 still in progress.
2020-10-12 22:20:20,814 DEBUG curator.utils wait_for_it:1811 Response: False
2020-10-12 22:20:20,814 DEBUG curator.utils wait_for_it:1831 Action "snapshot" not yet complete, 0 total seconds elapsed. Waiting 9 seconds before checking again.
2020-10-12 22:20:29,819 DEBUG curator.utils wait_for_it:1808 Elapsed time: 9 seconds
2020-10-12 22:20:29,830 DEBUG curator.utils snapshot_check:1589 Snapshot state = PARTIAL
2020-10-12 22:20:29,830 WARNING curator.utils snapshot_check:1598 Snapshot curator-20201012222020 completed with state PARTIAL.
2020-10-12 22:20:29,830 DEBUG curator.utils wait_for_it:1811 Response: True
2020-10-12 22:20:29,830 DEBUG curator.utils wait_for_it:1816 Action "snapshot" finished executing (may or may not have been successful)
2020-10-12 22:20:29,830 DEBUG curator.utils wait_for_it:1834 Result: True
2020-10-12 22:20:29,837 ERROR curator.actions.snapshot report_state:1678 Snapshot PARTIAL completed with state: PARTIAL
2020-10-12 22:20:29,837 ERROR curator.cli run:213 Failed to complete action: snapshot. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: Snapshot PARTIAL completed with state: PARTIAL
The ES master node log looks like this:
[2020-10-12T22:20:20,782][INFO ][o.e.s.SnapshotsService ] [] snapshot [logs_backup:curator-20201012222020/_-bVH1xFTK29bpMnkPh87g] started
[2020-10-12T22:20:21,051][INFO ][o.e.s.SnapshotsService ] [] snapshot [logs_backup:curator-20201012222020/_-bVH1xFTK29bpMnkPh87g] completed with state [PARTIAL]
ES version: 7.7.1
OS: CentOS 7
Curator version: 5.8.1
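Neither log above says which shards failed or why. That detail is normally in the snapshot info response (GET _snapshot/logs_backup/curator-20201012222020), under the "failures" and "shards" fields. A minimal Python sketch for pulling it out, assuming the cluster answers at http://localhost:9200 without authentication (an assumption; adjust the URL and add credentials as needed). The helper names summarize_failures and fetch_snapshot_info are illustrative, not part of Curator:

```python
import json
import urllib.request


def summarize_failures(snapshot_response):
    """Return (state, failed_shards, total_shards, reasons) for the first snapshot.

    `snapshot_response` is the parsed JSON body of
    GET _snapshot/<repository>/<snapshot>.
    """
    snap = snapshot_response["snapshots"][0]
    shards = snap.get("shards", {})
    # Each failure entry names the index, the shard, and the reason it failed.
    reasons = [
        (f.get("index"), f.get("shard_id"), f.get("reason"))
        for f in snap.get("failures", [])
    ]
    return snap.get("state"), shards.get("failed", 0), shards.get("total", 0), reasons


def fetch_snapshot_info(base_url, repository, snapshot):
    """Fetch snapshot info over plain HTTP (assumes no authentication)."""
    url = f"{base_url}/_snapshot/{repository}/{snapshot}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Assumed endpoint; repository and snapshot names are taken from the logs above.
    info = fetch_snapshot_info(
        "http://localhost:9200", "logs_backup", "curator-20201012222020"
    )
    state, failed, total, reasons = summarize_failures(info)
    print(f"state={state} failed_shards={failed}/{total}")
    for index, shard_id, reason in reasons:
        print(f"{index}[{shard_id}]: {reason}")
```

The printed reasons should show whether the failed shards were unassigned, hit a repository write error, or something else.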
Is anyone familiar with this scenario?
Thanks in advance,
Anil.