Elasticsearch Snapshot

I'm using Snapshot Lifecycle Management (SLM) to create nightly snapshots. My questions are:

  1. I saw this comes under the Basic license, but I got confused when I saw the X-Pack tag on the SLM page. Please clarify.

  2. Since this will be a daily snapshot, if I restore the most recent snapshot,
    will that be enough?

  3. How do I get the most recent snapshot to do a restore?

  4. How do I delete old snapshots (deletion with a wildcard is not working in 7.4.2)? Which ones are safe to delete, as it's incremental?

X-Pack is a set of modules available in the default distributions.
X-Pack comes by default with a Basic license (no cost).
Some of the X-Pack features require a Gold or Platinum license. You can check that at
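Independently of that page, the cluster's active license can also be checked directly via the license API (a quick sketch, assuming curl against a local node):

```shell
# The active license of the cluster, as reported by the license API
# (available in the default distribution):
curl -s "localhost:9200/_license?pretty"

# The "type" field in the response reads "basic", "gold", "platinum",
# etc.; for example, it can be pulled out with grep:
curl -s "localhost:9200/_license" | grep -o '"type" *: *"[a-z]*"'
```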

Yes.

You need to understand that although Elasticsearch is smart enough to do incremental backups, a snapshot should always be considered a full backup. So restore whichever full backup you want, at any time.
The fact that it's incremental behind the scenes should not be a concern for you.

Thanks for the reply. I was trying to restore with the repo name and snapshot name:
curl -XPOST "localhost:9200/_snapshot/{REPO_NAME}/{SNAPSHOT_NAME}/_restore?wait_for_completion=true"

Since I'm giving the snapshot name, I'm struggling to find the most recent snapshot every time.

I'm running the command curl localhost:9200/_cat/snapshots/as_repo?v to get the list of snapshots and find the most recent one based on the timestamp.

Is there a way?

If you defined an SLM policy as the guide said:

PUT /_slm/policy/nightly-snapshots
{
  "schedule": "0 30 1 * * ?", 
  "name": "<nightly-snap-{now/d}>", 
  "repository": "my_repository", 
  "config": { 
    "indices": ["*"] 
  },
  "retention": { 
    "expire_after": "30d", 
    "min_count": 5, 
    "max_count": 50 
  }
}

It should be "easy" to know which snapshot is the most recent one, based on the name.

And if you run:

GET /_slm/policy/nightly-snapshots?human

This will give you the latest successful snapshot:

{
  "nightly-snapshots" : {
    "version": 1,
    "modified_date": "2019-04-23T01:30:00.000Z",
    "modified_date_millis": 1556048137314,
    "policy" : {
      "schedule": "0 30 1 * * ?",
      "name": "<nightly-snap-{now/d}>",
      "repository": "my_repository",
      "config": {
        "indices": ["*"],
      },
      "retention": {
        "expire_after": "30d",
        "min_count": 5,
        "max_count": 50
      }
    },
    "last_success": {                                                    
      "snapshot_name": "nightly-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a", 
      "time_string": "2019-04-24T16:43:49.316Z",
      "time": 1556124229316
    } ,
    "last_failure": {                                                    
      "snapshot_name": "nightly-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
      "time_string": "2019-04-02T01:30:00.000Z",
      "time": 1556042030000,
      "details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat 
org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
    } ,
    "next_execution": "2019-04-24T01:30:00.000Z",                        
    "next_execution_millis": 1556048160000
  }
}

You just have to read the value of nightly-snapshots.last_success.snapshot_name.
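That step can be scripted; a minimal sketch, assuming the policy and repository names from the example above and python3 available next to curl:

```shell
# Fetch the SLM policy status and extract the name of the latest
# successful snapshot (nightly-snapshots.last_success.snapshot_name):
SNAPSHOT_NAME=$(curl -s "localhost:9200/_slm/policy/nightly-snapshots" |
  python3 -c 'import sys, json
doc = json.load(sys.stdin)
print(doc["nightly-snapshots"]["last_success"]["snapshot_name"])')

# Then restore it, as in the earlier curl example:
curl -XPOST "localhost:9200/_snapshot/my_repository/${SNAPSHOT_NAME}/_restore?wait_for_completion=true"
```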

Many thanks. How will this work when I restore in a different cluster? Should I schedule the SLM policy in that cluster also?

curl -XGET ...:9200/_slm/policy/nightly-snapshots?pretty
{
  "nightly-snapshots" : {
    "version" : 4,
    "modified_date_millis" : 1605014014404,
    "policy" : {
      "name" : "<nightly-snap-{now/d}>",
      "schedule" : "0 30 1 * * ?",
      "repository" : "as_repo",
      "config" : {
        "indices" : [
          "*"
        ]
      }
    },
    "last_success" : {
      "snapshot_name" : "nightly-snap-2020.11.10-mnctreehqx6f23narysogq",
      "time" : 1605015175451
    },
    "next_execution_millis" : 1605058200000
  }
}

As you see, one snapshot is available. A different cluster using the same NFS is not able to find this SLM policy:

curl -XGET localhost:9200/_slm/policy/nightly-snapshots?pretty
{
  "error" : {
    "root_cause" : [
      {
        "type" : "resource_not_found_exception",
        "reason" : "snapshot lifecycle policy or policies [nightly-snapshots] not found"
      }
    ],
    "type" : "resource_not_found_exception",
    "reason" : "snapshot lifecycle policy or policies [nightly-snapshots] not found"
  },
  "status" : 404
}

But it is able to find that snapshot. Again, in this case, the same issue: this query does not work in the different cluster:

curl -XGET localhost:9200/_slm/policy/nightly-snapshots?pretty

So how do I get the most recent snapshot name? Am I missing something?

Please don't post unformatted code, logs, or configuration as it's very hard to read.

Instead, paste the text and format it with </> icon or pairs of triple backticks (```), and check the preview window to make sure it's properly formatted before posting it. This makes it more likely that your question will receive a useful answer.

How will this work when I restore in a different cluster?

You can call the API from anywhere.

# To be executed in cluster1
GET /_slm/policy/nightly-snapshots?human

In the second cluster, you just need to create the repository which points to the same repository you created in cluster1.

# To be executed in cluster2
PUT /_snapshot/my_repository
{
  "type": "fs",
  "settings": {
    "location": "my_backup_location"
  }
}

And use a "normal" restore in cluster2.

# To be executed in cluster2
POST /_snapshot/my_repository/LATEST_SNAPSHOT_NAME/_restore

Sorry for not asking a proper question. In case cluster1 is down and unreachable, I will not be able to execute:
GET /_slm/policy/nightly-snapshots?human

curl -XGET 164.99.185.202:9200/_slm/policy/nightly-snapshots?pretty
{
  "error" : {
    "root_cause" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503

Correct?
My main requirement is to restore when cluster1 goes down completely, with no master-eligible nodes available, and then restore the latest snapshot to cluster2. How do I get the latest snapshot in that case?

That's true.

In that case, you can check the snapshot name as I mentioned earlier.

Are you referring to
POST /_snapshot/my_repository/LATEST_SNAPSHOT_NAME/_restore ?
Does that mean I should get LATEST_SNAPSHOT_NAME from cluster1 using the SLM command beforehand and use it in cluster2 when required?

No. You can list all the snapshots in a repository from cluster2 with:
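(The exact command block appears to be missing from this reply; a sketch using the _cat snapshots API already shown earlier in the thread, assuming the shared repository is registered in cluster2 as my_repository:)

```shell
# List every snapshot in the shared repository from cluster2. Sorting on
# end_epoch puts the most recent snapshot on the last line:
curl -s "localhost:9200/_cat/snapshots/my_repository?v&s=end_epoch"

# Grab just the id of the most recent snapshot:
curl -s "localhost:9200/_cat/snapshots/my_repository?h=id,end_epoch&s=end_epoch" |
  tail -n 1 | awk '{print $1}'
```

This only needs the repository definition in cluster2, not the SLM policy, so it works even when cluster1 is completely down.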

Thanks, it's working fine. So with the below SLM policy:

PUT /_slm/policy/nightly-snapshots
{
  "schedule": "0 30 1 * * ?", 
  "name": "<nightly-snap-{now/d}>", 
  "repository": "my_repository", 
  "config": { 
    "indices": ["*"] 
  },
  "retention": { 
    "expire_after": "30d", 
    "min_count": 5, 
    "max_count": 50 
  }
} 

1) Should I worry about deleting old snapshots periodically, or will the retention policy take care of that?
2) I tried setting expire_after to 1m and max_count to 5; when I create more than 5 snapshots, the old ones are not getting deleted. I'm able to see that. I saw a similar issue reported:
Snapshot Retention Task not deleting expired snapshots. I'm using 7.4.2. Is it a bug?

  1. I guess it will be done automatically. (I have never used this feature.)
  2. It was fixed in 7.5.2, so upgrade to 7.9.3 :slight_smile:

That means it's possible this feature may not work as expected :frowning: In that case, I can simply delete all snapshots after 30 days, correct? The only impact will be that the next day's snapshot job takes more time, correct?

Correct.
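That manual cleanup can be scripted one snapshot at a time; a sketch, since wildcard deletion is unavailable in 7.4.2 (the repository name my_repository is an assumption):

```shell
# Delete every snapshot whose end time is more than 30 days old.
# end_epoch from the _cat API is in seconds since the epoch.
CUTOFF=$(( $(date +%s) - 30*24*3600 ))
curl -s "localhost:9200/_cat/snapshots/my_repository?h=id,end_epoch" |
while read -r ID END; do
  if [ "$END" -lt "$CUTOFF" ]; then
    # One DELETE per snapshot; Elasticsearch keeps any segment files
    # that newer snapshots still reference, so deleting old snapshots
    # is safe despite them being incremental.
    curl -XDELETE "localhost:9200/_snapshot/my_repository/${ID}"
  fi
done
```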

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.