Corrupted merge when migrating from 0.20.5 to 0.90.5


(george_monroe) #1

Guys,

We have tried several times now, and each time we get corrupted merge
exceptions when migrating from 0.20.5 to 0.90.5. Any ideas? What are we
doing wrong?

Migration steps:
(1) stop indexing incoming data (close index)
(2) flush index
(3) shut down node (service stop)
(4) back up data folder and zip it up
(5) upgrade to ES 0.90.5
(6) transfer zip to new AWS server
(7) unzip data and copy folder to the new elasticsearch data location
(8) change cluster name (needed because the environments use different names)
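
To rule out damage during the zip / transfer / unzip steps, a per-file checksum comparison between the old and new data folders could look roughly like the sketch below. The function names `build_manifest` and `diff_manifests` are made up for illustration and are not part of Elasticsearch. It assumes both folders are passed in at the same depth (for example, the directory just below the cluster-name folder), so that the relative paths line up even though the cluster name changed.

```python
import hashlib
import os


def build_manifest(data_dir):
    """Map each file's path (relative to data_dir) to its SHA-256 hex digest."""
    manifest = {}
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            digest = hashlib.sha256()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large segment files do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(path, data_dir)] = digest.hexdigest()
    return manifest


def diff_manifests(source, target):
    """Return relative paths missing on either side or whose digests differ."""
    return sorted(
        rel for rel in set(source) | set(target)
        if source.get(rel) != target.get(rel)
    )
```

Run `build_manifest` on the old server after the node is stopped and again on the new server before the node is started, then pass both results to `diff_manifests`. An empty list would suggest the copy itself is intact, though it says nothing about whether the files were already damaged before the backup.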

All shards come back just fine. Then we try to index new data into the new
ES 0.90.5 cluster, one shard dies, and we get the following:

org.apache.lucene.index.CorruptIndexException: docs out of order (287 <= 287 ) (docOut: org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput@2342d884)
at org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.startDoc(Lucene41PostingsWriter.java:243)
at org.apache.lucene.codecs.PostingsConsumer.merge(PostingsConsumer.java:115)
at org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:164)
at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:72)
at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:365)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:98)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3772)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3376)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:91)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
[2013-10-31 17:00:16,808][WARN ][index.engine.robin ] [ocho-intdev-newelasticsearch01.us-east-1a.dfengg.com] [ocho][3] failed engine
org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: docs out of order (287 <= 287 ) (docOut: org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput@2342d884)
at org.elasticsearch.index.merge.scheduler.ConcurrentMergeSchedulerProvider$CustomConcurrentMergeScheduler.handleMergeException(ConcurrentMergeSchedulerProvider.java:99)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
Caused by: org.apache.lucene.index.CorruptIndexException: docs out of order (287 <= 287 ) (docOut: org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput@2342d884)
at org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.startDoc(Lucene41PostingsWriter.java:243)
at org.apache.lucene.codecs.PostingsConsumer.merge(PostingsConsumer.java:115)
at org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:164)
at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:72)
at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:365)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:98)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3772)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3376)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:91)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
[2013-10-31 17:00:16,813][WARN ][cluster.action.shard ] [ocho-intdev-newelasticsearch01.us-east-1a.dfengg.com] sending failed shard for [ocho][3], node[BHN9A7FKT6SJqJfIxeRnPA], [P], s[STARTED], reason [engine failure, message [MergeException[org.apache.lucene.index.CorruptIndexException: docs out of order (287 <= 287 ) (docOut: org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput@2342d884)]; nested: CorruptIndexException[docs out of order (287 <= 287 ) (docOut: org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput@2342d884)]; ]]
[2013-10-31 17:00:16,816][WARN ][cluster.action.shard ] [ocho-intdev-newelasticsearch01.us-east-1a.dfengg.com] received shard failed for [ocho][3], node[BHN9A7FKT6SJqJfIxeRnPA], [P], s[STARTED], reason [engine failure, message [MergeException[org.apache.lucene.index.CorruptIndexException: docs out of order (287 <= 287 ) (docOut: org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput@2342d884)]; nested: CorruptIndexException[docs out of order (287 <= 287 ) (docOut: org.apache.lucene.store.RateLimitedFSDirectory$RateLimitedIndexOutput@2342d884)]; ]]
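
In case it helps anyone going through a longer log, a small sketch for pulling the failed index and shard number out of "failed engine" lines like the ones above. The pattern is inferred only from the log lines in this post, so treat the layout as an assumption. `failed_shards` is a made-up helper name, and the regex tolerates the line breaks that email wrapping adds.

```python
import re

# Matches e.g. "[2013-10-31 17:00:16,808][WARN ][index.engine.robin ] [node] [ocho][3] failed engine".
# Whitespace (including newlines from wrapped logs) is allowed between the parts.
FAILED_ENGINE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]"
    r"\[WARN\s*\]\[index\.engine\.\w+\s*\]\s*"
    r"\[\s*(?P<node>[^\]]+?)\s*\]\s*"
    r"\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]\s+failed\s+engine"
)


def failed_shards(log_text):
    """Return (timestamp, index name, shard number) for each 'failed engine' entry."""
    return [
        (m.group("ts"), m.group("index"), int(m.group("shard")))
        for m in FAILED_ENGINE.finditer(log_text)
    ]
```

Fed the log above, this should report shard 3 of index `ocho`, which is the shard worth inspecting on disk.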



(system) #2