Probably need some better error detection...
Thanks for the help.
...Ken
Shay Banon wrote:
It means that the gateway store got corrupted. You will have to rebuild
the index. Probably due to all the HEAD changes... Hopefully it's getting
stable now.

-shay.banon
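For anyone landing here with the same corruption: against this 0.9-era REST API, the rebuild Shay describes would look roughly like the sketch below. The host, index name, and shard/replica counts are taken from Ken's config at the bottom of the thread; the exact create-index body (and whether the delete also clears the gateway copy) is an assumption, so verify against your version.

    # Drop the corrupted index; this should also remove its gateway copy.
    curl -XDELETE 'http://192.168.1.5:9200/twitter'

    # Recreate it with the same layout as the config quoted below...
    curl -XPUT 'http://192.168.1.5:9200/twitter' -d '{
        "index" : {
            "number_of_shards" : 2,
            "number_of_replicas" : 1
        }
    }'

    # ...then re-feed the documents from the original source.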
On Fri, Aug 20, 2010 at 8:13 PM, Kenneth Loafman
<kenneth.loafman@gmail.com> wrote:

Looks like a file may be missing on the gateway... this repeats in the
log over and over.

[12:10:00,597][WARN ][indices.cluster ] [Magilla] [twitter][1] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [twitter][1] Failed to recover translog
    at org.elasticsearch.index.gateway.blobstore.BlobStoreIndexShardGateway.recoverTranslog(BlobStoreIndexShardGateway.java:516)
    at org.elasticsearch.index.gateway.blobstore.BlobStoreIndexShardGateway.recover(BlobStoreIndexShardGateway.java:417)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:172)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [twitter][1] Failed to open reader on writer
    at org.elasticsearch.index.engine.robin.RobinEngine.start(RobinEngine.java:171)
    at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryPrepareForTranslog(InternalIndexShard.java:405)
    at org.elasticsearch.index.gateway.blobstore.BlobStoreIndexShardGateway.recoverTranslog(BlobStoreIndexShardGateway.java:440)
    ... 5 more
Caused by: java.io.FileNotFoundException: /mnt/search-data-dev/elasticsearch/nodes/1/indices/twitter/1/index/_d8g.cfs (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
    at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:76)
    at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:97)
    at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:87)
    at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:67)
    at org.elasticsearch.index.store.support.AbstractStore$StoreDirectory.openInput(AbstractStore.java:287)
    at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:67)
    at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:114)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:590)
    at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:616)
    at org.apache.lucene.index.IndexWriter$ReaderPool.getReadOnlyClone(IndexWriter.java:574)
    at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:150)
    at org.apache.lucene.index.ReadOnlyDirectoryReader.<init>(ReadOnlyDirectoryReader.java:36)
    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:410)
    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:374)
    at org.elasticsearch.index.engine.robin.RobinEngine.buildNrtResource(RobinEngine.java:538)
    at org.elasticsearch.index.engine.robin.RobinEngine.start(RobinEngine.java:158)
    ... 7 more
[12:10:00,605][WARN ][cluster.action.shard ] [Magilla] sending failed shard for [twitter][1], node[10dab323-019b-4036-854f-89bb068dcc8d], [P], s[INITIALIZING], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[twitter][1] Failed to recover translog]; nested: EngineCreationFailureException[[twitter][1] Failed to open reader on writer]; nested: FileNotFoundException[/mnt/search-data-dev/elasticsearch/nodes/1/indices/twitter/1/index/_d8g.cfs (No such file or directory)]; ]]

Shay Banon wrote:
> Also, use the latest again, pushed some more fixes.
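Before rebuilding, it is worth confirming on the data node that the segment file named in the trace above really is gone. The path comes straight from the FileNotFoundException; adjust the node and shard numbers for your own layout:

    # List the shard's Lucene files and look for the compound file the trace names.
    ls -l /mnt/search-data-dev/elasticsearch/nodes/1/indices/twitter/1/index/ | grep _d8g

If nothing matches, the _d8g.cfs file is missing locally, and recovery can only succeed if the gateway still holds an intact copy.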
> > On Fri, Aug 20, 2010 at 8:04 PM, Shay Banon
> > <shay.banon@elasticsearch.com> wrote:
> >
> > Do you see any exceptions in the logs (failing to start the shard)?
> >
> > On Fri, Aug 20, 2010 at 8:02 PM, Kenneth Loafman
> > <kenneth.loafman@gmail.com> wrote:
> >
> > Now it's looping: progress is going to 100, then starting over.
> >
> > I set up a 1/second loop using:
> >     while /bin/true; do date; curl -XGET 'http://192.168.1.5:9200/twitter/_status?pretty=true'; sleep 1; done
> > then copied the output to a gist: http://gist.github.com/540711
> >
> > It should have recovered by now, I would think.
> >
> > ...Ken
> >
> > Shay Banon wrote:
> > great, ping me if it does not end, I am here to help (we can make it
> > more interactive on IRC).
> >
> > p.s. Can you keep the original json format when you gist? Much easier
> > to know what's going on. You can add pretty=true as a parameter to get
> > it pretty printed.
> >
> > -shay.banon
> >
> > On Fri, Aug 20, 2010 at 5:51 PM, Kenneth Loafman
> > <kenneth.loafman@gmail.com> wrote:
> >
> > I think so... Here's the latest on gist: http://gist.github.com/540471
> >
> > Thanks for the pointer on gist, I've never used it before.
> >
> > Shay Banon wrote:
> > > The top just states which shards were queried; a shard that is still
> > > not allocated will obviously not be queried. It seems like it's still
> > > in the recovery process. There are two main APIs to really understand
> > > what is going on (besides the high-level health API): the cluster
> > > state API, which shows you what the cluster-wide state is (where each
> > > shard is supposed to be, what its state is), and the status API,
> > > which gives you detailed information on the status of each shard
> > > allocated on each node.
> > >
> > > Is the recovery progressing?
> > >
> > > p.s. Can you use gist instead of pastebin?
> > >
> > > -shay.banon
> > >
> > > On Fri, Aug 20, 2010 at 5:13 PM, Kenneth Loafman
> > > <kenneth.loafman@gmail.com> wrote:
> > >
> > > I restarted and now 35 of 36 are successful, but if you look at the
> > > status, it's showing multiple shards in recovery. I'm confused.
> > >
> > > See cluster status in http://pastebin.com/9qWLf3mk
> > >
> > > Kenneth Loafman wrote:
> > > > Will do so in just a bit...
> > > >
> > > > Shay Banon wrote:
> > > >> ... can you test?
> > > >>
> > > >> On Fri, Aug 20, 2010 at 4:02 PM, Shay Banon
> > > >> <shay.banon@elasticsearch.com> wrote:
> > > >>
> > > >> Just pushed a fix for this.
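For reference, the two APIs Shay describes above are plain HTTP GETs; the host and index name below are the ones from Ken's polling loop:

    # Cluster-wide view: where each shard is supposed to be and what state it is in.
    curl -XGET 'http://192.168.1.5:9200/_cluster/state?pretty=true'

    # Per-shard detail for one index, including gateway recovery progress.
    curl -XGET 'http://192.168.1.5:9200/twitter/_status?pretty=true'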
> > > >> On Fri, Aug 20, 2010 at 3:31 PM, Kenneth Loafman
> > > >> <kenneth.loafman@gmail.com> wrote:
> > > >>
> > > >> Attachments did not make it. See:
> > > >> http://pastebin.com/ziALRgx5 -- cluster state
> > > >> http://pastebin.com/63Xm95xM -- index status
> > > >>
> > > >> Sorry, they lost their formatting on Pastebin.
> > > >>
> > > >> ...Ken
> > > >>
> > > >> Kenneth Loafman wrote:
> > > >> > I upgraded to last night's version, restarted, and things are
> > > >> > worse. Now I have 5 shards hung in recovery, not all on the
> > > >> > same node. Weird.
> > > >> >
> > > >> > I've attached the info you want. I'll leave things running
> > > >> > for now.
> > > >> >
> > > >> > ...Thanks,
> > > >> > ...Ken
> > > >> >
> > > >> > Shay Banon wrote:
> > > >> >> Do you still have it running? Can you gist the cluster state
> > > >> >> and the index status results?
> > > >> >> I see that you are using master; I have fixed several things
> > > >> >> in this area. Can you pull a new version?
> > > >> >>
> > > >> >> -shay.banon
> > > >> >>
> > > >> >> On Fri, Aug 20, 2010 at 12:33 AM, Kenneth Loafman
> > > >> >> <kenneth.loafman@gmail.com> wrote:
> > > >> >>
> > > >> >> > It seems to have started recovery, but it's been 7.5 hours
> > > >> >> > and it appears to be stopped/hung...
> > > >> >> > > "1": [
> > > >> >> > >     {
> > > >> >> > >         "gateway_recovery": {
> > > >> >> > >             "index": {
> > > >> >> > >                 "expected_recovered_size": "0b",
> > > >> >> > >                 "expected_recovered_size_in_bytes": 0,
> > > >> >> > >                 "recovered_size": "0b",
> > > >> >> > >                 "recovered_size_in_bytes": 0,
> > > >> >> > >                 "reused_size": "0b",
> > > >> >> > >                 "reused_size_in_bytes": 0,
> > > >> >> > >                 "size": "0b",
> > > >> >> > >                 "size_in_bytes": 0,
> > > >> >> > >                 "throttling_time": "0s",
> > > >> >> > >                 "throttling_time_in_millis": 0
> > > >> >> > >             },
> > > >> >> > >             "stage": "RETRY",
> > > >> >> > >             "start_time_in_millis": 1282226019603,
> > > >> >> > >             "throttling_time": "7.6h",
> > > >> >> > >             "throttling_time_in_millis": 27514627,
> > > >> >> > >             "time": "7.6h",
> > > >> >> > >             "time_in_millis": 27514657,
> > > >> >> > >             "translog": {
> > > >> >> > >                 "recovered": 0
> > > >> >> > >             }
> > > >> >> > >         },
> > > >> >> > >         "index": {
> > > >> >> > >             "size": "0b",
> > > >> >> > >             "size_in_bytes": 0
> > > >> >> > >         },
> > > >> >> > >         "routing": {
> > > >> >> > >             "index": "twitter",
> > > >> >> > >             "node": "031642a1-968f-40fb-b7c2-5a869769d5b4",
> > > >> >> > >             "primary": true,
> > > >> >> > >             "relocating_node": null,
> > > >> >> > >             "shard": 1,
> > > >> >> > >             "state": "INITIALIZING"
> > > >> >> > >         },
> > > >> >> > >         "state": "RECOVERING"
> > > >> >> > >     }
> > > >> >> > > ]
> > > >> >> >
> > > >> >> > Shay Banon wrote:
> > > >> >> > It should be allocated on the other node; you shouldn't need
> > > >> >> > to start another node. When you issue a cluster health (a
> > > >> >> > simple curl can do it), what is the status? The cluster state
> > > >> >> > API gives you more information if that's what you are after
> > > >> >> > (each shard and its state).
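The cluster health call mentioned above is also just a one-line curl against the same host; the status field in the response comes back as green, yellow, or red:

    # High-level health: status, node count, active/relocating/initializing shards.
    curl -XGET 'http://192.168.1.5:9200/_cluster/health?pretty=true'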
> > > >> >> > On Thu, Aug 19, 2010 at 3:48 PM, Kenneth Loafman
> > > >> >> > <kenneth.loafman@gmail.com> wrote:
> > > >> >> >
> > > >> >> > No, this is the first time. The shutdown took a while, with
> > > >> >> > several "Waiting for node to shutdown..." style messages. It
> > > >> >> > came up bad after that.
> > > >> >> >
> > > >> >> > So, if I have two nodes now, and one needs to be recovered,
> > > >> >> > I'll need 3 nodes to get the recovery done?
> > > >> >> >
> > > >> >> > ...Ken
> > > >> >> >
> > > >> >> > Shay Banon wrote:
> > > >> >> > > The shard will be allocated to another node and recovered
> > > >> >> > > there. Do you see it happen continuously?
> > > >> >> > >
> > > >> >> > > -shay.banon
> > > >> >> > >
> > > >> >> > > On Thu, Aug 19, 2010 at 2:28 PM, Kenneth Loafman
> > > >> >> > > <kenneth.loafman@gmail.com> wrote:
> > > >> >> > >
> > > >> >> > > Hi,
> > > >> >> > >
> > > >> >> > > The second shard on one of my indexes has failed due to:
> > > >> >> > > [05:59:47,332][WARN ][index.gateway ] [Mangog] [twitter][1] failed to snapshot on close
> > > >> >> > > ...followed by a long traceback.
> > > >> >> > > ...followed by:
> > > >> >> > > [05:59:49,336][WARN ][cluster.action.shard ] [Mangog] received shard failed for [twitter][1], node[86d601df-e124-45ed-a5f2-57d762042d87], [P], s[INITIALIZING], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[twitter][1] Failed to recovery translog]; nested: EngineCreationFailureException[[twitter][1] Failed to open reader on writer]; nested: FileNotFoundException[/mnt/search-data-dev/elasticsearch/nodes/0/indices/twitter/1/index/_d8g.cfs (No such file or directory)]; ]]
> > > >> >> > >
> > > >> >> > > Is the recovery process automatic, or do I have to do
> > > >> >> > > something special? It appears to be just this one shard.
> > > >> >> > >
> > > >> >> > > I use the service wrapper to start/stop 0.9.1-SNAPSHOT,
> > > >> >> > > and my config is below.
> > > >> >> > >
> > > >> >> > > ...Thanks,
> > > >> >> > > ...Ken
> > > >> >> > >
> > > >> >> > > cloud:
> > > >> >> > >     aws:
> > > >> >> > >         access_key: *****
> > > >> >> > >         secret_key: *****
> > > >> >> > >
> > > >> >> > > gateway:
> > > >> >> > >     type: s3
> > > >> >> > >     s3:
> > > >> >> > >         bucket: *****
> > > >> >> > >
> > > >> >> > > path :
> > > >> >> > >     work : /mnt/search-data-dev
> > > >> >> > >     logs : /mnt/search-data-dev/node1/logs
> > > >> >> > >
> > > >> >> > > index :
> > > >> >> > >     number_of_shards : 2
> > > >> >> > >     number_of_replicas : 1
> > > >> >> > >
> > > >> >> > > network :
> > > >> >> > >     host : 192.168.1.5
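A footnote on the service wrapper Ken mentions: assuming the elasticsearch-servicewrapper layout of that era (the script path is an assumption, not confirmed in the thread), start/stop looks like this:

    # Control script shipped with the service wrapper (assumed install path).
    bin/service/elasticsearch start     # launch the node in the background
    bin/service/elasticsearch stop      # ask the node to shut down cleanly
    bin/service/elasticsearch console   # run in the foreground for debugging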