We've recently upgraded our hardware, and have now got a replacement ES cluster.
I want to restore the indices on our old cluster into/onto the new cluster without losing what is currently on the new cluster. I.e. I want to merge the data.
Which is the best way to achieve this? I've read the snapshot/restore docs and it appears that the restore doesn't have a non-destructive/merge feature, only a "drop and recreate" (to borrow an antiquated phrase) method.
This is all on Windows, which I'm sure I'll be told makes things even more difficult.
The challenge here from Elasticsearch's perspective is how to perform the 'merge' of the data. If there is a clash between the clusters (i.e. a document already exists on the new cluster when you index from the old cluster), what should ES do? Keep the new one? The old one? Or maybe there is something in the document that would indicate which one to keep? These are questions that are much easier for you to answer in a custom application/tool than for Elasticsearch to answer with out-of-the-box options.
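To make the clash question concrete, here is a minimal sketch of the kind of decision a custom tool would have to encode. Everything in it is an assumption for illustration: the `resolve` function, the policy names, and the `updated_at` field are hypothetical, and your documents may have no such timestamp at all.

```python
from typing import Optional


def resolve(existing: Optional[dict], incoming: dict,
            policy: str = "keep_existing") -> dict:
    """Pick which version of a document survives the merge.

    existing: the document already on the new cluster (None if absent).
    incoming: the document read from the old cluster.
    """
    if existing is None:
        # No clash: the old-cluster document is simply copied over.
        return incoming
    if policy == "keep_existing":
        return existing
    if policy == "keep_incoming":
        return incoming
    if policy == "latest":
        # Hypothetical 'updated_at' field holding ISO 8601 strings in one
        # consistent format, so plain string comparison orders them.
        newer = incoming.get("updated_at", "") > existing.get("updated_at", "")
        return incoming if newer else existing
    raise ValueError("unknown policy: " + policy)
```

Whichever rule fits your data, the point is that it lives in your tool, where you know what the documents mean.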
In which case, another option could be to restore the old cluster's data to a different index in the new cluster and use an alias to query both at the same time.
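As a rough sketch of the alias approach: the restore API accepts `rename_pattern` and `rename_replacement`, so the old indices can be restored under different names, and one alias can then be pointed at both sets. The snippet below only builds the request body you would POST to `/_aliases`; the index and alias names are made up for the example.

```python
import json


def alias_actions(alias: str, indices: list) -> dict:
    """Build an _aliases request body that adds every index to one alias."""
    return {
        "actions": [
            {"add": {"index": name, "alias": alias}} for name in indices
        ]
    }


# Hypothetical names: 'docs' is the index already on the new cluster,
# 'restored_docs' is the old cluster's index restored under a new name.
body = alias_actions("all_docs", ["docs", "restored_docs"])
print(json.dumps(body, indent=2))
```

Searching `all_docs` would then hit both indices. Note that documents with the same ID in both indices will both come back, since an alias does not deduplicate.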
I've gone for the small app option with scroll and bulk API calls, but I notice I am not getting the _type information - which is quite essential to what we are doing.
I seem to remember ES just simply doesn't return this field in a search - but surely there's a way to force it to do so?
Actually, I can see what's happening here. I am getting the _id/_type/etc. fields returned, but NEST is not including them when it parses the hits into the Documents collection. This is going to take some wrestling.
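For anyone following along without NEST, the metadata sits on each raw hit next to `_source`, so a copy tool can carry it straight into the bulk request. Below is a minimal sketch, not the code from my app: `hits_to_bulk` is a hypothetical helper, and it assumes the hits are in the shape a scroll/search response returns them (`_index`, `_type`, `_id`, `_source`) on a version of ES that still has types.

```python
import json


def hits_to_bulk(hits: list, action: str = "create") -> str:
    """Turn raw search hits into a newline-delimited bulk request body.

    action="create" fails for IDs that already exist on the target, so
    documents on the new cluster are left alone; action="index" would
    overwrite them instead.
    """
    lines = []
    for hit in hits:
        # Carry the metadata over explicitly so _type and _id are preserved.
        meta = {"_index": hit["_index"], "_type": hit["_type"], "_id": hit["_id"]}
        lines.append(json.dumps({action: meta}))
        lines.append(json.dumps(hit["_source"]))
    # The bulk API expects every line, including the last, to end in a newline.
    return "".join(line + "\n" for line in lines)


# Made-up hit for illustration.
sample = [{"_index": "docs", "_type": "invoice", "_id": "1",
           "_source": {"total": 10}}]
print(hits_to_bulk(sample), end="")
```

Each scroll page would be run through this and sent to `_bulk` on the new cluster, with the per-item results checked for failures.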