I'm working on an ES upgrade from v0.20.5 to v1.2.1
I tested in a 2-node cluster with 3 indices, ~4 million docs, 18G of files,
20 shards, and 1 replica.
However, after bumping the version and rebooting the cluster, I kept seeing
that some shards were damaged. The ES log said:
Caused by: org.apache.lucene.index.CorruptIndexException: did not read all
bytes from file: read 451 vs size 452 (resource:
BufferedChecksumIndexInput(MMapIndexInput(path="/18/index/_195c_i.del")))
This is badly blocking the version upgrade in my case.
Could anyone point me to the cause of this issue?
Thanks a lot for your help!
We did not run into this issue when we upgraded from 0.20.6 to 1.3.1, but
from looking at the upgrade docs, we did a few things to try and protect
against index corruption, which looks like what you ran into.
1 - we stopped any apps from writing to the indexes when we started our
upgrade
2 - we flushed the cluster before bringing it down
3 - we disabled shard allocation/replication before bringing the cluster
down (just to make sure all nodes brought back the same indexes that were
on their machines).
4 - when we brought everything back up, we ran optimize on each index.
This was noted as a task to do because the index format changed in the
newer releases, so it was recommended to run optimize, which rewrites all
of the index segments. It was unclear whether older indexes would really
work in an upgraded cluster, so we did not take the chance and took the
time hit to run optimize.
5 - re-enabled shard/replication allocations
6 - the cluster was working just fine
Hope our steps help you retry the cluster upgrade.
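For reference, a minimal sketch of the sequence above using the ES 1.x REST API (the host/port and index name are assumptions for illustration; the setting and endpoint names are from the 1.x docs):

```shell
# 1/2 - stop writers at the application level, then flush so all
#       operations are committed to disk before the restart
curl -XPOST 'http://localhost:9200/_flush'

# 3 - disable shard allocation so shards stay on their nodes
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.disable_allocation": true }
}'

# ... perform the full cluster restart on the new version ...

# 4 - optimize each index to rewrite segments in the new Lucene format
#     ("myindex" is a placeholder; repeat per index)
curl -XPOST 'http://localhost:9200/myindex/_optimize?max_num_segments=1'

# 5 - re-enable shard allocation
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.disable_allocation": false }
}'
```

Note that optimize down to one segment is I/O-heavy on 18G of indexes, so expect the time hit mentioned above.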
On Tuesday, September 9, 2014 4:18:54 PM UTC-7, Wei wrote: