Elasticsearch Curator restore only if there are changes


Restoring the .kibana index requires closing it, restoring it, and then reopening it, which creates downtime in Kibana. I would like Curator to restore the .kibana index only if there are changes (new saved objects, etc.) to reduce the total downtime.

Is there a good way to accomplish this?

No. There are no real ways to check for updates in the manner you are specifying. However…

The .kibana "index" is actually an alias now. You are far better off restoring to a .kibana-restore_suffix and changing the .kibana alias to point to that than to restore over the top of an existing .kibana. After restoring a .kibana index, you could then check for changes on a per-document basis, then determine if an update was made (you would probably need to script something here), and then re-link the alias if deemed necessary.
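The re-linking step can be done atomically: Elasticsearch's `_aliases` endpoint accepts multiple actions in one request and applies them together, so the alias is never missing in between. A minimal sketch that builds that request body (the index names here are made up for illustration):

```python
import json

def alias_swap_actions(alias, old_index, new_index):
    """Build the request body for Elasticsearch's _aliases endpoint that
    atomically re-points `alias` from `old_index` to `new_index`."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

# POST this body to <cluster>/_aliases; both actions apply in a single
# atomic operation, so Kibana never sees the alias disappear.
body = alias_swap_actions(".kibana", ".kibana_1", ".kibana-restore_20200601")
print(json.dumps(body, indent=2))
```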

Ah, yes. I was referring to the .kibana_n index (in my case .kibana_1).

Thanks for the reply, @theuntergeek, but what you described here isn't achievable via Curator, is it? You did not design Curator to run arbitrary code (i.e., some script), so I am unsure where this script would live.

I currently have Curator (a k8s CronJob running every 15 minutes) restore from an S3 bucket (a snapshot taken from another k8s cluster). If I can't do this via Curator exclusively, I was thinking about creating a custom Docker image that runs a script (instead of the curator CLI command as ENTRYPOINT) that would 1) determine if a restore is necessary somehow (via some kind of a diff or last-updated timestamp) and 2) run the curator CLI to restore only if there is an update.

Not really sure how to go about step 1. Any advice would be appreciated. Thanks.
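For step 1, one possible approach (a sketch, not something Curator does for you): export the saved objects from both sides, strip volatile bookkeeping fields, and compare fingerprints, only invoking the curator CLI when they differ. The field names ignored below (`updated_at`, `version`) are illustrative assumptions about what counts as noise:

```python
import hashlib
import json

def export_fingerprint(ndjson_dump, ignore_fields=("updated_at", "version")):
    """Compute a stable fingerprint for a Kibana saved-objects export
    (ndjson: one JSON object per line), ignoring volatile fields so a
    no-op re-export does not look like a change."""
    objs = []
    for line in ndjson_dump.strip().splitlines():
        obj = json.loads(line)
        for field in ignore_fields:
            obj.pop(field, None)
        objs.append(obj)
    # Sort by id so export order does not affect the fingerprint.
    objs.sort(key=lambda o: o.get("id", ""))
    canonical = json.dumps(objs, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def restore_needed(previous_dump, current_dump):
    """True only when the two exports differ in meaningful content."""
    return export_fingerprint(previous_dump) != export_fingerprint(current_dump)
```

The wrapper script would persist the last fingerprint (e.g., in a file or a small index), and exec `curator --config ...` only when `restore_needed` returns True.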

Please, let's back up a bit here and ask why you need to restore every 15 minutes? What's the use case?

Let's just say that for n Kibana instances across n k8s clusters, only one Kibana instance in one cluster is the source of truth for all saved objects. I would like to snapshot/restore (in a timely manner) every 15 minutes to keep all instances in sync, but restoring means downtime, so I am looking for ways to reduce the overall downtime by not running a restore when there is no delta.

Are you load-balancing (or failing over) between multiple Elasticsearch clusters within the same K8s environment? I am still trying to wrap my brain around the need to keep "n number of Kibana instances in sync between n k8s clusters," such that "only 1 Kibana instance from 1 cluster is the source of truth." I have never seen this architectural approach and want to fully understand it.

Snapshot restore is not the best way to keep things in sync between clusters. The proper feature for keeping indices in sync between clusters already exists as a premium feature called cross-cluster replication. Any other approach will be fraught with potential hiccups and gotchas, and will also be quite manual—you'd have to create and maintain it yourself.

Sorry for the confusion. To keep things simple, let's just say there is a dev instance and a prod instance of Elasticsearch/Kibana, and I would like to make changes to a Kibana dashboard in dev and have them reflected in the production Kibana.

That cross-cluster replication feature looks great, but we are not a paying customer. We are also running version 7.6.2.

I would not try to keep instances identical between clusters in this way (even if it can be compelled to work), for a few reasons. Kibana visualizations, saved searches, dashboards, and so on are all tied to Index Patterns. By pursuing things this way, you are forced to keep Index Patterns synced and identical between environments, even when it does not make sense to do so. If you ever need to delete and re-create an Index Pattern, it will have a different UUID/Hash/Name, which will force you to re-connect (or worse, recreate) all of the visualizations to the updated Index Pattern ID.

Using Kibana API calls, it is possible to get these Index Pattern names, and upload/update visualization JSON structures as you would with any regular document. This is how the Beats upload Kibana dashboards and create/update Index Patterns. In the event that a change is made on one, it should be possible to check saved/stored documents in Kibana with API calls and simply push/merge them to the other clusters.
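For reference, Kibana 7.x exposes a saved-objects export endpoint (`POST /api/saved_objects/_export`) that returns ndjson suitable for diffing or re-import. A sketch of building that call with only the standard library; the Kibana URL is an assumption you would adjust, and actually sending the request of course needs a running instance:

```python
import json
import urllib.request

KIBANA_URL = "http://localhost:5601"  # assumption: point at your Kibana

def export_request(types=("dashboard", "visualization", "index-pattern")):
    """Build the request for Kibana's saved-objects export API.
    The response body is ndjson: one saved object per line."""
    return urllib.request.Request(
        KIBANA_URL + "/api/saved_objects/_export",
        data=json.dumps({"type": list(types)}).encode(),
        headers={
            "Content-Type": "application/json",
            "kbn-xsrf": "true",  # Kibana requires this header on API writes
        },
        method="POST",
    )

# Usage (requires a reachable Kibana):
# with urllib.request.urlopen(export_request()) as resp:
#     ndjson_dump = resp.read().decode()
```

The matching import endpoint (`POST /api/saved_objects/_import`, with an `overwrite` query parameter) takes the same ndjson back, which is the push/merge half of the workflow described above.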

If you cannot use cross-cluster replication (which was made available in Elasticsearch 6.6, by the way), then I highly recommend using API calls to update visualizations. If that means storing the JSON dumps in a git repo, doing diffs, and then pushing out changes to other clusters, then so be it. It's a better approach, and one much less likely to cause breakage (which would surely happen with ad-hoc snapshot restores, since closing the index is required).


Makes sense. Thanks for the advice. I'll try the Kibana API.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.