Reindexing daily indices and maintaining the same index name


(Casie Owen) #1

Hi,

We're using reindex from remote in order to pull indices from a 1.7.2 cluster to a 5.1.2 cluster. We combined the guidance on how to use painless for this scenario and the reindex from remote option from this article: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html to generate the script below. It works; it does exactly what the documentation says it will do: All the new indices are named what they were in the source cluster, except there is a "-1" appended. Thing is, we don't want the name changed at all. How do we do that?

Here is what we tried first (it's our application of the example in the article above)

POST _reindex
{
  "source": {
    "remote": {
      "host": "http://xx.xx.xx.xx:9200"
    },
    "index": "onlineusers_*"
  },
  "dest": {
    "index": "onlineusers_"
  },
  "script": {
    "lang": "painless",
    "inline": "ctx._index = 'onlineusers' + (ctx._index.substring('onlineusers'.length(), ctx._index.length())) + '-1'"
  }
}

We messed around a bit with the ctx script to try to get it to not append anything. For example, we tried:
"inline": "ctx._index = 'onlineusers' + (ctx._index.substring('onlineusers'.length(), ctx._index.length()))"

but when we do that, it just puts all the documents from the daily indices into one index named onlineusers_.

Worst case, I can use the script as provided, which appends the -1, then reindex again locally with this:
"inline": "ctx._index = 'twittervolumeitems' + (ctx._index.substring('twittervolumeitems'.length(), ctx._index.length()-2))"
to remove the "-1" from the end. I tested that and it works, but it's double the work, and it seems we should be able to use the initial script to keep the same index name.
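
To make the two-pass workaround concrete, here is a rough Python sketch of the string arithmetic that the two Painless scripts perform. It is only an illustration, not Painless: the function names and the sample index name onlineusers_2017.01.15 are made up for the example.

```python
def append_suffix(index, prefix, suffix="-1"):
    # Mirrors: prefix + ctx._index.substring(prefix.length(), ctx._index.length()) + '-1'
    return prefix + index[len(prefix):len(index)] + suffix


def strip_suffix(index, prefix, n=2):
    # Mirrors: prefix + ctx._index.substring(prefix.length(), ctx._index.length() - 2)
    return prefix + index[len(prefix):len(index) - n]


original = "onlineusers_2017.01.15"  # hypothetical daily index name
first_pass = append_suffix(original, "onlineusers")    # name after the remote reindex
second_pass = strip_suffix(first_pass, "onlineusers")  # name after the local reindex

print(first_pass)
print(second_pass)
```

The second pass only restores the original name because the first pass appended exactly two characters; a longer suffix would need a matching change to the `- 2`.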

Thanks,
Casie


(Nik Everett) #2

I just had this conversation with another engineer at Elastic. It isn't possible right now and it'd be fairly nasty to fix in Elasticsearch. Personally, if I had a bunch of daily indices I'd want to do them one at a time so I can see progress, maybe parallelize them, maybe blow one away and start it over if I find a mistake, etc. I'd write a bash script that used the _cat/indices API.

So with the fix being kind of painful, and the workaround being a thing I'd personally do anyway, I'm somewhat inclined not to fix the issue. Though I should probably file an issue so we at least have a record of all this and can talk about it.
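
The one-index-at-a-time approach described above could look roughly like the following. This is a sketch in Python rather than bash, under several assumptions: the local cluster address, the helper names, and the onlineusers_* pattern are placeholders, and the destination cluster is assumed to already allow the remote host via reindex.remote.whitelist (as it must for the request in the first post to work). It lists the source indices with _cat/indices and issues one _reindex request per index, naming the destination explicitly so no script is needed.

```python
import json
import urllib.request

REMOTE = "http://xx.xx.xx.xx:9200"  # old cluster (placeholder from the first post)
LOCAL = "http://localhost:9200"     # assumed address of the new cluster


def parse_cat_indices(text):
    """Return index names from the plain-text output of _cat/indices?h=index."""
    return sorted(line.strip() for line in text.splitlines() if line.strip())


def reindex_body(index, remote_host=REMOTE):
    """Build a _reindex body that copies one remote index to the same name."""
    return {
        "source": {"remote": {"host": remote_host}, "index": index},
        "dest": {"index": index},
    }


def http(method, url, body=None):
    """Send a request and return the response body as text."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    req = urllib.request.Request(url, data=data, method=method, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def reindex_all(pattern="onlineusers_*"):
    """Reindex every matching remote index, one request per index."""
    listing = http("GET", f"{REMOTE}/_cat/indices/{pattern}?h=index")
    for name in parse_cat_indices(listing):
        result = json.loads(http("POST", f"{LOCAL}/_reindex", reindex_body(name)))
        print(name, "created:", result.get("created"), "failures:", result.get("failures"))


# Call reindex_all() to run the loop against real clusters.
```

Because each index gets its own request, a failed or mistaken index can be deleted and rerun on its own, and the loop could be split across several workers if needed.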


(Nik Everett) #3

I filed:


(Casie Owen) #4

Thanks. I think we're going to go with reindexing using the script provided, which appends the -1, then just reindex again to remove the last two characters. I have no experience with bash, and since we're seeing good performance, that seems to be the most efficient method for us.

