Remote reindex with wildcard to multiple indices

lduvnjak · November 9, 2022, 3:04pm

Hey Everyone,

I'm having some issues running a remote reindex from cluster A to cluster B. The end goal is for the user to provide an index pattern, and for the remote reindex to fetch all indices matching that pattern from the remote cluster and create them on the local cluster.

This should include having op_type set to "create", as it should be used to sync the clusters in case of some failure. I tried taking the script provided by Elastic here: Reindex API | Elasticsearch Guide [8.5] | Elastic, and just removing the part where it appends a minus (-), but that ends up failing.

This is what the script looks like currently:

POST _reindex
{
  "source": {
    "remote": {
      "host": "https://remote_host:9200",
      "username": "elastic",
      "password": "SomePass"
    },
    "index": "source-*"
  },
  "dest": {
    "index": "source",
    "op_type": "create"
  },
  "script": {
    "lang": "painless",
    "source": "ctx._index = 'source-' + (ctx._index.substring('source-'.length(), ctx._index.length()))"
  }
}

One more thing: I'm not sure whether creating missing indices on the destination will break ILM. Do the reindexed indices still contain the same metadata ILM uses to tell the indices apart (their age, their order, and so on)?

I'm aware some people have written scripts for this in bash and so on, but I would primarily like to know whether there's a way to do this purely with Elastic and its API. If not, I can write the logic used in the bash scripts myself in Ansible.

Thanks in advance for any help!

warkolm · November 9, 2022, 10:55pm

Failing how? It helps if you share the response from Elasticsearch, and any relevant logs.

As for ILM: nope, the reindexed indices won't carry that metadata over.

lduvnjak · November 9, 2022, 11:10pm

Sorry for not giving an example. There is no actual error; what ends up happening is that all the data from the source indices gets written to one destination index. In this example it would be "source".

Only when I make the index name different from the source do they get replicated semi-correctly. For example, by adding a minus at the end, or any other string. By semi-correctly I mean the data is correct, but of course the names of the indices are different, which is not acceptable in this situation.

Regarding the second question, is there any way to make ILM work in this situation as on the source cluster? If it's not possible in the basic license, would it be possible using CCR?

Thanks!

warkolm · November 10, 2022, 6:06am

Right, because that's what you told it to do.
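To illustrate: the Painless expression in the question strips the 'source-' prefix and then puts it straight back, so the computed name is identical to the original one. The Python sketch below mirrors only that string arithmetic and does not talk to Elasticsearch; the helper name and the sample index names are made up. As far as I understand it, reindex only redirects a document when the script actually changes ctx._index, which would explain why everything lands in the dest index "source". Treat that last part as an assumption.

```python
PREFIX = "source-"

def scripted_index(original: str, suffix: str = "") -> str:
    """Mirror of the Painless expression: prefix + remainder of the name (+ optional suffix)."""
    return PREFIX + original[len(PREFIX):] + suffix

# Hypothetical index names, for illustration only.
for name in ["source-2022.11.08", "source-2022.11.09"]:
    same = scripted_index(name)          # script from the question: nothing appended
    changed = scripted_index(name, "-")  # variant that appends a minus
    print(name, "->", same, "| unchanged:", same == name, "| with suffix:", changed)
```

With nothing appended the result equals the input, so from the script's point of view the index name was never modified.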

There's not currently an easy way to index multiple source indices into multiple destination ones, it's a DIY process using a for loop or something in some external code.
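A minimal sketch of that DIY loop, assuming the list of remote index names has already been fetched (for example with GET _cat/indices/source-*?h=index on the remote cluster). It only builds one _reindex request body per index and prints it; sending the requests is left out. The function name, the sample index names, and the use of fnmatch as a stand-in for Elasticsearch wildcard matching are all assumptions made for illustration.

```python
import fnmatch
import json

def build_reindex_bodies(index_names, pattern, remote):
    """Return one _reindex request body per matching index, keeping the index name."""
    bodies = []
    for name in sorted(index_names):
        # fnmatch only approximates Elasticsearch wildcard semantics.
        if not fnmatch.fnmatchcase(name, pattern):
            continue
        bodies.append({
            "source": {"remote": remote, "index": name},
            # Same name on the destination; op_type "create" as in the question.
            "dest": {"index": name, "op_type": "create"},
        })
    return bodies

# Hypothetical input; in practice this list comes from the remote cluster.
remote_indices = ["source-2022.11.08", "source-2022.11.09", "other-2022.11.09"]
remote = {"host": "https://remote_host:9200", "username": "elastic", "password": "SomePass"}

for body in build_reindex_bodies(remote_indices, "source-*", remote):
    print("POST _reindex")
    print(json.dumps(body, indent=2))
```

Because each request names a single source and a single destination, no script is needed. With op_type "create", documents that already exist on the destination produce version conflicts, so adding "conflicts": "proceed" to each body may be worth considering when re-syncing.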

If you reindexed into a write alias it will use the ILM policy, but it will treat the data as new and not factor in existing ILM settings on those indices.

CCR might work though.

Christian_Dahlqvist · November 10, 2022, 6:18am

Have you considered using the snapshot and restore APIs? These retain the index settings, but they copy the indices exactly as they are and do not allow you to change mappings, which you can do when reindexing.
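For reference, a rough sketch of what a restore request for a wildcard pattern could look like, assuming a snapshot repository is already registered on both clusters. The repository name "my_repo" and snapshot name "snapshot_1" are placeholders, and the helper only builds the path and body rather than sending anything.

```python
import json

def build_restore_request(repository, snapshot, index_pattern):
    """Build the path and body for a restore call limited to one index pattern."""
    path = f"/_snapshot/{repository}/{snapshot}/_restore"
    body = {
        "indices": index_pattern,      # wildcard patterns are accepted here
        "include_global_state": False  # restore only the indices, not cluster state
    }
    return path, body

# Placeholder repository and snapshot names.
path, body = build_restore_request("my_repo", "snapshot_1", "source-*")
print("POST", path)
print(json.dumps(body, indent=2))
```

Restored indices keep their original names and settings, which is the point made above.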

lduvnjak · November 10, 2022, 7:20am

Yeah I figured snapshot and restore would be my next best bet. Just wanted to confirm if there's a simpler way before doing that.

The potential issue with snapshot and restore would be the time it takes to complete. The cluster is quite large, with over 2 TB ingested daily. I'll see if I can make it work. Thank you for the help, Mark and Christian.

Cheers!

Christian_Dahlqvist · November 10, 2022, 7:22am

Using snapshot and restore will likely be faster and require fewer resources than reindexing.

system · December 8, 2022, 7:22am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
