Failing _reindex: IndexOutOfBoundsException after 40.000 documents

Hey there,

I am a contributor for the Apache James project, we rely on emails for searching emails.

As part of the 3.6.0 release we updated our mapping, implying a _reindex data.

We did run the procedure [1] on several production instances without incident untill I encountered array_out_of_bound exception [2] generating the following log [3]...

ElasticSearch version: 6.3

Indices details:

$ curl -XGET  http://xxxxx:9201/_cat/indices
green open mailbox_v2         CRU8eerkSsmvHZBgcscpTg 5 1 560006183 12232882 377.5gb 188.7gb

-------------------------------- [1] ------------------------

    1. Create a v2 index with the new mapping
curl -XPUT   http://xxx:9200/mailbox_v2 -d '{"mappings":{"_doc":{"dynamic":"strict","_routing":{"required":true},"properties":{"attachments":{"properties":{"contentDisposition":{"type":"keyword"},"fileExtension":{"type":"keyword"},"fileName":{"type":"text","analyzer":"standard"},"mediaType":{"type":"keyword"},"subtype":{"type":"keyword"},"textContent":{"type":"text","analyzer":"standard"}}},"bcc":{"properties":{"address":{"type":"text","fields":{"raw":{"type":"keyword","normalizer":"case_insensitive"}},"analyzer":"standard","search_analyzer":"keep_mail_and_url"},"name":{"type":"text","analyzer":"keep_mail_and_url"}}},"cc":{"properties":{"address":{"type":"text","fields":{"raw":{"type":"keyword","normalizer":"case_insensitive"}},"analyzer":"standard","search_analyzer":"keep_mail_and_url"},"name":{"type":"text","analyzer":"keep_mail_and_url"}}},"date":{"type":"date","format":"yyyy-MM-dd'\''T'\''HH:mm:ssZ"},"from":{"properties":{"address":{"type":"text","fields":{"raw":{"type":"keyword","normalizer":"case_insensitive"}},"analyzer":"standard","search_analyzer":"keep_mail_and_url"},"name":{"type":"text","analyzer":"keep_mail_and_url"}}},"hasAttachment":{"type":"boolean"},"headers":{"type":"nested","properties":{"name":{"type":"keyword"},"value":{"type":"text","analyzer":"keep_mail_and_url"}}},"htmlBody":{"type":"text","analyzer":"standard"},"isAnswered":{"type":"boolean"},"isDeleted":{"type":"boolean"},"isDraft":{"type":"boolean"},"isFlagged":{"type":"boolean"},"isRecent":{"type":"boolean"},"isUnread":{"type":"boolean"},"mailboxId":{"type":"keyword","store":true},"mediaType":{"type":"keyword"},"messageId":{"type":"keyword","store":true},"mimeMessageID":{"type":"keyword"},"modSeq":{"type":"long"},"sentDate":{"type":"date","format":"yyyy-MM-dd'\''T'\''HH:mm:ssZ"},"size":{"type":"long"},"subject":{"type":"text","fields":{"raw":{"type":"keyword","normalizer":"case_insensitive"}},"analyzer":"keep_mail_and_url"},"subtype":{"type":"keyword"},"textBody":{"type":"text","analyzer":"standard"},"to":{"properties":{"address":{"type":"text","fields":{"raw":{"type":"keyword","normalizer":"case_insensitive"}},"analyzer":"standard","search_analyzer":"keep_mail_and_url"},"name":{"type":"text","analyzer":"keep_mail_and_url"}}},"uid":{"type":"long","store":true},"text":{"type":"text"},"userFlags":{"type":"keyword","normalizer":"case_insensitive"}}}},"settings":{"index":{"number_of_shards":"5","analysis":{"filter":{"english_snowball":{"type":"snowball","language":"English"}},"normalizer":{"case_insensitive":{"filter":["lowercase","asciifolding"],"type":"custom","char_filter":[]}},"analyzer":{"keep_mail_and_url":{"filter":["lowercase","stop"],"tokenizer":"uax_url_email"},"snowball_keep_mail_and_token":{"filter":["lowercase","stop","english_snowball"],"tokenizer":"uax_url_email"}}},"number_of_replicas":"1","write":{"wait_for_active_shards":"1"}}}}' -H "Content-Type: application/json"
    1. Expose the new index behind the write alias
 curl -XPOST http://xxx:9200/_aliases -d '{"actions":[{"add":{"index":"mailbox_v2", "alias":"mailboxWriteAlias"}}]}'  -H "Content-Type: application/json"
    1. Trigger the _reindex call
curl -XPOST  http://xxx:9200/_reindex?wait_for_completion=false -d '{"source":{"index":"mailbox_v1"},"dest":{"index":"mailbox_v2"}}'  -H "Content-Type: application/json"

Will return a task than can then be monitored:

curl -XGET  http://xxx:9200/_tasks/q5gmnS29SdePhyLURaDHrg:75653957

On preprod, on a 250MB index this took less than a minute.

    1. Update James to target the new INDEX upon start (in elasticsearch.properties)
    1. Expose the new index behing the aliases, and drop the old alias
curl -XPOST http://xxx:9200/_aliases -d '{"actions":[{"add":{"index":"mailbox_v2", "alias":"mailboxReadAlias"}}]}'  -H "Content-Type: application/json"
curl -XPOST http://xxx:9200/_aliases -d '{"actions":[{"remove":{"index":"mailbox_v1", "alias":"mailboxReadAlias"}}]}'  -H "Content-Type: application/json"
curl -XPOST http://xxx:9200/_aliases -d '{"actions":[{"remove":{"index":"mailbox_v1", "alias":"mailboxWriteAlias"}}]}'  -H "Content-Type: application/json"
curl -XDELETE  http://xxx:9200/mailbox_v1

-------------------------------- [2] ------------------------

[2021-03-03T09:51:58,826][INFO ][o.e.t.LoggingTaskListener] 160536732 finished with response BulkByScrollResponse[took=46.4s,timed_out=false,sliceId=null,updated=40000,created=0,deleted=0,batches=40,versionConflicts=0,noops=0,retries=0,throttledUntil=0s,bulk_failures=[],search_failures=[{"index":"mailbox_v2","shard":1,"node":"uZRBWs_gQheRrUywinzQdQ","reason":{"type":"array_index_out_of_bounds_exception","reason":null}}]]

[2021-03-03T09:53:41,486][DEBUG][o.e.a.s.TransportSearchScrollAction] [xxxxxxxx-master2] [80256] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [xxxxxxxx-master2][xxxxxxxxx:9300][indices:data/read/search[phase/fetch/id/scroll]]
Caused by: java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkBounds(Buffer.java:571) ~[?:1.8.0_272]
at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:264) ~[?:1.8.0_272]
at org.apache.lucene.store.ByteBufferGuard.getBytes(ByteBufferGuard.java:93) ~[lucene-core-7.3.1.jar:7.3.1 ae0705edb59eaa567fe13ed3a222fdadc7153680 - caomanhdat - 2018-05-09 09:27:24]
at org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:89) ~[lucene-core-7.3.1.jar:7.3.1 ae0705edb59eaa567fe13ed3a222fdadc7153680 - caomanhdat - 2018-05-09 09:27:24]
at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:99) ~[lucene-core-7.3.1.jar:7.3.1 ae0705edb59eaa567fe13ed3a222fdadc7153680 - caomanhdat - 2018-05-09 09:27:24]
at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:138) ~[lucene-core-7.3.1.jar:7.3.1 ae0705edb59eaa567fe13ed3a222fdadc7153680 - caomanhdat - 2018-05-09 09:27:24]
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState.document(CompressingStoredFieldsReader.java:555) ~[lucene-core-7.3.1.jar:7.3.1 ae0705edb59eaa567fe13ed3a222fdadc7153680 - caomanhdat - 2018-05-09 09:27:24]
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.document(CompressingStoredFieldsReader.java:571) ~[lucene-core-7.3.1.jar:7.3.1 ae0705edb59eaa567fe13ed3a222fdadc7153680 - caomanhdat - 2018-05-09 09:27:24]
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:578) ~[lucene-core-7.3.1.jar:7.3.1 ae0705edb59eaa567fe13ed3a222fdadc7153680 - caomanhdat - 2018-05-09 09:27:24]
at org.apache.lucene.index.CodecReader.document(CodecReader.java:84) ~[lucene-core-7.3.1.jar:7.3.1 ae0705edb59eaa567fe13ed3a222fdadc7153680 - caomanhdat - 2018-05-09 09:27:24]
at org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:341) ~[lucene-core-7.3.1.jar:7.3.1 ae0705edb59eaa567fe13ed3a222fdadc7153680 - caomanhdat - 2018-05-09 09:27:24]
at org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:388) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.search.fetch.FetchPhase.createSearchHit(FetchPhase.java:199) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:156) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:516) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.action.search.SearchTransportService$10.messageReceived(SearchTransportService.java:430) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.action.search.SearchTransportService$10.messageReceived(SearchTransportService.java:427) ~[elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:259) [x-pack-security-6.3.2.jar:6.3.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:317) [x-pack-security-6.3.2.jar:6.3.2]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) [elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:664) [elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:725) [elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) [elasticsearch-6.3.2.jar:6.3.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.3.2.jar:6.3.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_272]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_272]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]

-------------------------------- [3] ------------------------

$ curl -XGET  http://XXXXX:9200/_tasks/uZRBWs_gQheRrUywinzQdQ:160536732 | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   915  100   915    0     0   111k      0 --:--:-- --:--:-- --:--:--  111k
{
  "completed": true,
  "task": {
    "node": "uZRBWs_gQheRrUywinzQdQ",
    "id": 160536732,
    "type": "transport",
    "action": "indices:data/write/reindex",
    "status": {
      "total": 18743957,
      "updated": 40000,
      "created": 0,
      "deleted": 0,
      "batches": 40,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1,
      "throttled_until_millis": 0
    },
    "description": "reindex from [mailbox_v2] to [mailbox_v3]",
    "start_time_in_millis": 1614765072351,
    "running_time_in_nanos": 46459129817,
    "cancellable": true,
    "headers": {}
  },
  "response": {
    "took": 46458,
    "timed_out": false,
    "total": 18743957,
    "updated": 40000,
    "created": 0,
    "deleted": 0,
    "batches": 40,
    "version_conflicts": 0,
    "noops": 0,
    "retries": {
      "bulk": 0,
      "search": 0
    },
    "throttled_millis": 0,
    "requests_per_second": -1,
    "throttled_until_millis": 0,
    "failures": [
      {
        "index": "mailbox_v2",
        "shard": 1,
        "node": "uZRBWs_gQheRrUywinzQdQ",
        "reason": {
          "type": "array_index_out_of_bounds_exception",
          "reason": null
        }
      }
    ]
  }
}

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.