One primary shard is "lost" permanently when updating data

Hi, all,

I have an urgent case and would appreciate any help. I have an index with 5
shards and no replicas, and it has been running fine on a single AWS EC2
c3.large box for months. I now need to update around 1 million data entries
in this index. I use bulk operations to apply the updates in 50K batches.
After about 300K updates, my bulk operations started to time out and fail.
I then checked the index status and got

"_shards" : {
"total" : 5,
"successful" : 4,
"failed" : 0
},
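
For reference, here is roughly how I issue the updates (a minimal sketch
using the elasticsearch-py bulk helper; the host, index name, doc type, and
update payload are placeholders, not my real ones):

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(["localhost:9200"], timeout=60)

BATCH_SIZE = 50000  # 50K update actions per bulk request

def update_actions(batch):
    # Build one partial-document update action per entry; "my_index",
    # "my_type", and the "doc" payload are placeholders.
    for entry_id, fields in batch:
        yield {
            "_op_type": "update",
            "_index": "my_index",
            "_type": "my_type",
            "_id": entry_id,
            "doc": fields,
        }

def run_updates(entries):
    # entries is an iterable of (id, fields-to-update) pairs.
    batch = []
    for entry in entries:
        batch.append(entry)
        if len(batch) >= BATCH_SIZE:
            bulk(es, update_actions(batch))
            batch = []
    if batch:
        bulk(es, update_actions(batch))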

One shard is lost permanently. I can still query for data, but any indexing
operation afterwards times out every few tries. The only way I have found to
fix this is to wipe out the whole index and restore it from a snapshot. I
tested Elasticsearch 1.3.5 and 1.4.1; both show this symptom. I tried
pausing for 10 seconds between bulk updates and setting refresh_interval to
-1, but neither helped.
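
Concretely, the two mitigations looked roughly like this (continuing the
sketch above; "my_index" and the batches_of_50k iterable are placeholders):

import time

# Disable automatic refreshes before the bulk run; I restore the
# default afterwards. "my_index" is a placeholder.
es.indices.put_settings(
    index="my_index",
    body={"index": {"refresh_interval": "-1"}},
)

for batch in batches_of_50k:   # same 50K batches as in the sketch above
    bulk(es, update_actions(batch))
    time.sleep(10)             # 10-second pause between bulk requests

# Re-enable refreshes once the updates are done.
es.indices.put_settings(
    index="my_index",
    body={"index": {"refresh_interval": "1s"}},
)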

Strangely, I ran the same operation on my Windows 8 machine and it worked
just fine there. I am not sure why it fails so badly on AWS. My data is
stored on a 100 GB EBS volume. Can anyone help? I am really worried about
data loss at this point.

Thanks a lot!
Jingzhao
