Agreed. As data grows, though, it makes less and less sense to have this feature, simply because of the overhead it will have. That said, because snapshotting is delta based, it can work well for shared gateway cases, and I would love to try to solve it for local gateway cases as well...
On Saturday, November 20, 2010 at 5:03 AM, Berkay Mollamustafaoglu wrote:
Understood :) Just FYI, there are uses other than corruption for such an API call. It is often necessary to get the index out of production environment and move it to another ES instance for troubleshooting, testing, etc.
Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype
On Fri, Nov 19, 2010 at 5:41 PM, Shay Banon shay.banon@elasticsearch.com wrote:
Yes, there can be an API for that, it just requires some work ;). But, regarding the chances of an index being corrupted, there is a lot of work at the Lucene level to make sure that this does not happen, and elasticsearch builds on that as well, and I feel pretty good about 0.13 and how it handles these cases.
On Saturday, November 20, 2010 at 12:33 AM, Berkay Mollamustafaoglu wrote:
There is always a potential for something to go wrong, especially in the early days, and replicas are not really a solution, since if there is corruption, it can potentially affect the replicas as well.
It is much better for mental health to have an offline copy of the data that you can verify. Ideally, we could get a backup of an index and start it somewhere else (another ES cluster). Can there be an API call that backs up the specified indices?
Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype
On Fri, Nov 19, 2010 at 5:01 PM, Shay Banon shay.banon@elasticsearch.com wrote:
Hi Paul,
  Basically, the gateway is adaptive: if things are not as expected, they will get copied over. So, in theory, you should be able to unmount and mount to a new location, and a new snapshot will happen. I say in theory since I have not tested it, and I believe that at least with the file system based gateway, I need to add checks that the directories are there and created, and if not, recreate them. What do you think?
  Regarding the backup of the gateway data, what do you mean by missing data? You should see data basically up to the point where you cp -R it (or a bit later). We talked about it a bit, but the local gateway should really simplify things without the need to have a shared file system. You do lose the ability to have a "backup", but basically, if all works according to plan, the replicas are your backup...
-shay.banon
On Friday, November 19, 2010 at 10:30 PM, Paul wrote:
Hey,
First off, congrats on 0.13.0. Been eagerly awaiting this update because of my fear of index corruption with 0.12. We had some production issues that are the impetus for this request.
Basically, our gateway blew up. In this case, an iSCSI link went bad, causing massive disk corruption. We're making updates to avoid corruption in this case. After this occurred I knew that the data on the gateway was probably shot. However, my three node cluster was still up and happily indexing, it just no longer had a gateway to write to. Tons of file not found exceptions.
What would have been awesome to be able to do at this point was bring up another share and send a command to the cluster to snapshot its entire state down to that share. Obviously, this would be an expensive operation, very similar to full gateway recovery, just in reverse.
Similarly, this would be a feature that would allow for reliable hot back-ups of the gateway. The reason that I say reliable is that I had a copy of my gateway data from a couple of weeks ago that I attempted to restore. This copy was made with a cp -R command while we were actively indexing. The gateway came up, however, ~25% of the content was missing. It was from the larger indexes that have the more frequent updates. I know it has been stated that a gateway copy should always be valid, but my experience is that this is not the case with ~30GB of data that are receiving a flow of updates, a few per second or so.
I'd happily raise a feature request for this, if it is something that is doable without massive rework.
Thanks!
Paul
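
[Editor's sketch] The thread touches on having an offline copy "that you can verify", and on a cp -R copy that came up with content missing. One rough way to check a copy is to compare the two directory trees file by file. The Python below is only a minimal sketch under stated assumptions: it knows nothing about elasticsearch or Lucene file formats, the function names are hypothetical, and a source that is still being written to will keep changing during the comparison, so it is only meaningful against a quiesced source or between two finished copies.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot_tree(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def compare_trees(source_root, copy_root):
    """Return (missing, extra, changed) relative paths of the copy vs. the source.

    missing: files present in the source but absent from the copy
    extra:   files present in the copy but absent from the source
    changed: files present in both whose contents differ
    """
    src = snapshot_tree(source_root)
    dst = snapshot_tree(copy_root)
    missing = sorted(set(src) - set(dst))
    extra = sorted(set(dst) - set(src))
    changed = sorted(p for p in set(src) & set(dst) if src[p] != dst[p])
    return missing, extra, changed
```

Calling compare_trees with the gateway directory and the copied directory (both paths are whatever your setup uses) returns three lists. Three empty lists mean the copy matches the source at the file level, which says nothing about whether the index itself is logically consistent.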