Hello,
I have been working out a process to restore part of a snapshot, start writing to the restored indexes and let our frontend read from them, and then restore the rest of the snapshot.
I think I have a good process. However, I noticed that when I perform the second restore (the restore of all the historical data indexes), none of the indexes that belong to a data stream are added to that data stream's list of backing indexes. We need each data stream to list the correct backing indexes so Kibana can view historical data.
Reading the documentation (Restore a snapshot | Elasticsearch Guide [8.17] | Elastic), I found this:
You can restore only a specific backing index from a data stream. However, the restore operation doesn’t add the restored backing index to any existing data stream.
I could use this request to add a restored index to its data stream:
POST _data_stream/_modify
{
  "actions": [
    {
      "add_backing_index": {
        "data_stream": "data-stream-index-that-was-restored",
        "index": ".ds-logs-2024.10.25-000247"
      }
    }
  ]
}
But this does not scale well. I need to reassign many indexes to different data streams. You can't provide a list as the value for index. You can't provide wildcards. Lastly, you can't add multiple add_backing_index actions to the actions list. So I am left with doing one request at a time. I can write a script that takes a list of indexes and their associated data streams and sends a single request for every index in each data stream in an automated way. However, it feels wrong that I would need to make hundreds of requests for this modification, especially for a snapshot restore.
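A minimal sketch of such a script, in Python, might look like the following. It only builds the request bodies and prints them; sending them is left out. It assumes the backing indexes still follow the default naming pattern .ds-<data-stream>-<yyyy.MM.dd>-<generation>, so the data stream name can be read from the index name. The function names data_stream_of and modify_requests are made up for illustration.

```python
import json
import re

# Assumption: backing indexes use the default naming pattern
# .ds-<data-stream>-<yyyy.MM.dd>-<generation>; renamed indexes will not match.
BACKING_INDEX = re.compile(r"^\.ds-(?P<stream>.+)-\d{4}\.\d{2}\.\d{2}-\d{6}$")


def data_stream_of(index_name):
    """Return the data stream name implied by a backing index name, or None."""
    match = BACKING_INDEX.match(index_name)
    return match.group("stream") if match else None


def modify_requests(index_names):
    """Build one _data_stream/_modify body per backing index."""
    bodies = []
    for index_name in index_names:
        stream = data_stream_of(index_name)
        if stream is None:
            continue  # not a backing index (for example a static index), skip it
        bodies.append(
            {
                "actions": [
                    {
                        "add_backing_index": {
                            "data_stream": stream,
                            "index": index_name,
                        }
                    }
                ]
            }
        )
    return bodies


if __name__ == "__main__":
    restored = [
        ".ds-logs-2024.10.25-000247",
        ".ds-logs-2024.10.24-000246",
        "static-index",
    ]
    for body in modify_requests(restored):
        print("POST _data_stream/_modify")
        print(json.dumps(body, indent=2))
```

Each printed body has the same shape as the request above, so it can be sent with whatever HTTP client the restore tooling already uses.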
Is there a way to restore a snapshot where the indexes of data streams are restored into backing indexes atomically? Maybe there is some special index in the snapshot that has this as an attribute?
Here is some context on how I restore:
- Snapshots are restored to different DR clusters that are only used if there is an outage.
- The snapshots include indices: *
- I start by finding the most recent indexes associated with data streams and static indexes, and I restore those first, along with the cluster state. This gives me the starting indexes, API keys, and dashboards.
- Then I open those indexes and make sure data can be written to them and read by our frontend.
- Once the restored indexes are confirmed to be working, I close all other indexes associated with data streams, plus the static indexes. I don't close any system indexes, just our own data indexes.
- I start the restore of all the historical data in these other indexes.
This works well because it lets us write new data to the most current indexes while our frontend reads that data, and the restore of all the historical data can run in the background.
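The step of finding the most recent index per data stream can be sketched roughly as below. This is an illustration under assumptions, not the exact tooling I use: it assumes the default backing index naming pattern .ds-<data-stream>-<yyyy.MM.dd>-<generation> and treats the highest generation number as the most recent index. The function name split_restore_phases is made up, and indexes that do not match the pattern are returned separately so they can be assigned to a phase by hand.

```python
import re
from collections import defaultdict

# Assumption: default backing index naming,
# .ds-<data-stream>-<yyyy.MM.dd>-<generation>
BACKING_INDEX = re.compile(
    r"^\.ds-(?P<stream>.+)-\d{4}\.\d{2}\.\d{2}-(?P<generation>\d{6})$"
)


def split_restore_phases(index_names):
    """Split index names into (newest, historical, other).

    newest:     the highest-generation backing index of each data stream
    historical: every other backing index
    other:      names that do not look like backing indexes
    """
    by_stream = defaultdict(list)
    other = []
    for name in index_names:
        match = BACKING_INDEX.match(name)
        if match:
            generation = int(match.group("generation"))
            by_stream[match.group("stream")].append((generation, name))
        else:
            other.append(name)

    newest, historical = [], []
    for members in by_stream.values():
        members.sort()  # ascending by generation number
        newest.append(members[-1][1])
        historical.extend(name for _, name in members[:-1])
    return sorted(newest), sorted(historical), sorted(other)


if __name__ == "__main__":
    snapshot_indices = [
        ".ds-logs-2024.10.24-000246",
        ".ds-logs-2024.10.25-000247",
        ".ds-metrics-2024.10.20-000003",
        "static-index",
    ]
    newest, historical, other = split_restore_phases(snapshot_indices)
    print("first restore: ", ",".join(newest))
    print("second restore:", ",".join(historical))
    print("decide by hand:", ",".join(other))
```

The comma-joined lists are meant to be used as the indices value of the two restore requests.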
The one problem is that none of the restored data stream indexes are reconnected to their data streams as backing indexes. So when I go into Kibana, I don't see any of the historical data. However, if I add an index with the request above, I can see its data.
Can anyone recommend a way to restore these backing index connections properly, so I don't need to send many requests to assign them manually?