ContentTooLongException when using Reindex API

Hi there!

Lately I've been looking for an efficient way to reindex a lot of data from a 1.x cluster to a 5.x cluster, and I'm evaluating some options. Right now I'm using the Reindex API, and in general it works well.

However, I'm running into failures that appear to be caused by responses from the source cluster that are too large for the target cluster to buffer:

[2016-10-31T18:23:35,802][DEBUG][o.e.c.RestClient         ] request [POST http://localhost:9205/my-index/_search?size=1000&scroll=5m&sort=_doc%3Aasc] failed
org.apache.http.ContentTooLongException: entity content is too long [15032733] for the configured buffer limit [10485760]
           at org.elasticsearch.client.HeapBufferedAsyncResponseConsumer.onEntityEnclosed(HeapBufferedAsyncResponseConsumer.java:79) ~[rest-5.0.0.jar:5.0.0]
           at org.apache.http.nio.protocol.AbstractAsyncResponseConsumer.responseReceived(AbstractAsyncResponseConsumer.java:131) ~[httpcore-nio-4.4.5.jar:4.4.5]
           at org.apache.http.impl.nio.client.MainClientExec.responseReceived(MainClientExec.java:315) ~[httpasyncclient-4.1.2.jar:4.1.2]
           at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseReceived(DefaultClientExchangeHandlerImpl.java:147) ~[httpasyncclient-4.1.2.jar:4.1.2]
           at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.responseReceived(HttpAsyncRequestExecutor.java:303) [httpcore-nio-4.4.5.jar:4.4.5]
           at org.apache.http.impl.nio.client.InternalRequestExecutor.responseReceived(InternalRequestExecutor.java:108) [httpasyncclient-4.1.2.jar:4.1.2]
           at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:255) [httpcore-nio-4.4.5.jar:4.4.5]
           at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81) [httpasyncclient-4.1.2.jar:4.1.2]
           at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39) [httpasyncclient-4.1.2.jar:4.1.2]
           at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114) [httpcore-nio-4.4.5.jar:4.4.5]
           at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162) [httpcore-nio-4.4.5.jar:4.4.5]
           at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337) [httpcore-nio-4.4.5.jar:4.4.5]
           at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315) [httpcore-nio-4.4.5.jar:4.4.5]
           at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276) [httpcore-nio-4.4.5.jar:4.4.5]
           at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) [httpcore-nio-4.4.5.jar:4.4.5]
           at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588) [httpcore-nio-4.4.5.jar:4.4.5]
           at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]

As expected, when I reduce the scroll size that the _reindex API uses, it succeeds more often.

Is it possible to increase this buffer with a configuration parameter? I briefly skimmed through the code, but it looks like the default value (10 MB) is overridden only in tests.

Thanks!

Haris

No, it isn't possible. This was filed a few days ago but I haven't gotten to it yet:

I will get to it soon. For now you can lower the size in the way the link suggests.
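
To make the workaround concrete, here is a rough sketch of how a smaller batch size could be estimated from the numbers in the log above and placed in the `source.size` field of the reindex request body. The helper names (`safe_batch_size`, `reindex_body`), the 50% headroom factor, and the choice to reuse `my-index` as the destination index name are assumptions for illustration only. The estimate is based on an average document size, so a batch of unusually large documents could still exceed the buffer.

```python
import json
import math


def safe_batch_size(observed_bytes, observed_docs, buffer_limit, headroom=0.5):
    """Estimate a scroll batch size that should fit in the response buffer.

    Uses the average response bytes per document from one failed request and
    keeps only `headroom` (a fraction between 0 and 1) of the buffer limit.
    """
    avg_bytes_per_doc = observed_bytes / observed_docs
    return max(1, math.floor(buffer_limit * headroom / avg_bytes_per_doc))


def reindex_body(remote_host, source_index, dest_index, batch_size):
    """Build a reindex-from-remote request body with an explicit batch size."""
    return {
        "source": {
            "remote": {"host": remote_host},
            "index": source_index,
            "size": batch_size,  # documents fetched per scroll request
        },
        "dest": {"index": dest_index},
    }


# Numbers taken from the exception message: a 15032733-byte response for
# size=1000, against a 10485760-byte buffer limit.
size = safe_batch_size(15032733, 1000, 10485760)

# Host and source index come from the logged request; the destination index
# name is hypothetical.
body = reindex_body("http://localhost:9205", "my-index", "my-index", size)
print(json.dumps(body, indent=2))
```

With these inputs the estimate comes out at 348 documents per batch. The printed JSON would be sent as the body of `POST _reindex` on the 5.x cluster.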

Embarrassing that I didn't look on GitHub! Thanks for the quick and detailed reply, and of course thanks for this great piece of software called Elasticsearch!