Elasticsearch-rest-client-5.6.1.jar org.apache.http.ContentTooLongException: entity content is too long [217056451] for the configured buffer limit [104857600]


#1

I am using the low-level REST API to query an ES 5.4.0 or 5.5.2 server and got the following exception.
Is there any way to change the bufferLimitBytes?

12:34:50,453 INFO [stdout] (default task-2) org.apache.http.ContentTooLongException: entity content is too long [217056451] for the configured buffer limit [104857600]
12:34:50,453 INFO [stdout] (default task-2) at org.elasticsearch.client.HeapBufferedAsyncResponseConsumer.onEntityEnclosed(HeapBufferedAsyncResponseConsumer.java:76) ~[elasticsearch-rest-client-5.6.1.jar:5.6.1]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.nio.protocol.AbstractAsyncResponseConsumer.responseReceived(AbstractAsyncResponseConsumer.java:131) ~[httpcore-nio-4.4.5.jar:4.4.5]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.client.MainClientExec.responseReceived(MainClientExec.java:315) ~[httpasyncclient-4.1.2.jar:4.1.2]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseReceived(DefaultClientExchangeHandlerImpl.java:147) ~[httpasyncclient-4.1.2.jar:4.1.2]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.responseReceived(HttpAsyncRequestExecutor.java:303) ~[httpcore-nio-4.4.5.jar:4.4.5]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:255) ~[httpcore-nio-4.4.5.jar:4.4.5]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81) ~[httpasyncclient-4.1.2.jar:4.1.2]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39) ~[httpasyncclient-4.1.2.jar:4.1.2]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114) ~[httpcore-nio-4.4.5.jar:4.4.5]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162) ~[httpcore-nio-4.4.5.jar:4.4.5]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337) ~[httpcore-nio-4.4.5.jar:4.4.5]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315) ~[httpcore-nio-4.4.5.jar:4.4.5]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276) ~[httpcore-nio-4.4.5.jar:4.4.5]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[httpcore-nio-4.4.5.jar:4.4.5]
12:34:50,453 INFO [stdout] (default task-2) at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588) ~[httpcore-nio-4.4.5.jar:4.4.5]
12:34:50,453 INFO [stdout] (default task-2) at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
12:34:50,453 INFO [stdout] (default task-2) Suppressed: org.apache.http.ContentTooLongException: entity content is too long [217056451] for the configured buffer limit [104857600]


(David Pilato) #2

Please format your code using the </> icon as explained in this guide. It will make your post more readable.

Or use markdown style like:

```
CODE
```

The first question is: why are you getting back such a big response? What kind of query are you running?

To answer your question, have a look at


#3

The above link does not help. I really am retrieving that much data.
I am retrieving 180k documents (with index.max_result_window set to 1 million), so the query size is set to 1,000,000. A query result over 200MB is normal. If I set the query size to 80k, the result is just below 100MB and everything works fine. I know I can use the scroll API to fetch the data piece by piece, but this is only an occasional need. It would be nice if the buffer size were settable rather than fixed at the 100MB default.
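The numbers quoted above can be sanity-checked with a little arithmetic. The following is only a rough sketch: it assumes the response size grows linearly with the number of hits, which is an approximation, and the class and method names are made up for illustration. The byte counts come from the exception message in the first post.

```java
// Back-of-the-envelope sizing using only the figures quoted in this thread.
// Assumption: response size scales linearly with the number of hits.
final class ResponseSizeEstimate {
    // Buffer limit reported in the exception message (100 MB).
    static final long DEFAULT_BUFFER_LIMIT_BYTES = 104_857_600L;

    /** Average response bytes per hit. */
    static double bytesPerHit(long responseBytes, long hits) {
        return (double) responseBytes / hits;
    }

    /** Largest hit count whose estimated response still fits under the limit. */
    static long maxHitsUnderLimit(double bytesPerHit, long limitBytes) {
        return (long) Math.floor(limitBytes / bytesPerHit);
    }

    public static void main(String[] args) {
        // 217056451 bytes were returned for roughly 180k documents.
        double perHit = bytesPerHit(217_056_451L, 180_000L);
        System.out.printf("bytes per hit: %.0f%n", perHit);
        System.out.printf("estimated bytes for 80k hits: %.0f%n", perHit * 80_000);
        System.out.printf("max hits under the limit: %d%n",
                maxHitsUnderLimit(perHit, DEFAULT_BUFFER_LIMIT_BYTES));
    }
}
```

Under that linear assumption, 80k hits come out a little under the 100MB limit, which matches the behaviour described above.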

I am using the low-level REST API to run the query. Is there a better way to fetch millions of records in one shot?
Thanks


(David Pilato) #4

You definitely need to use scroll. By default Elasticsearch does not support more than 10000 results (from+size) for very good reasons, like cluster stability, OOM...
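For readers unfamiliar with scroll, here is a minimal sketch of the REST requests involved, written as plain string builders so it runs without the client on the classpath. The class and method names are hypothetical helpers, not part of the Elasticsearch client, and the `match_all` query stands in for whatever query is actually being run. The scroll ID is assumed to need no JSON escaping.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helpers that build the pieces of a scroll conversation.
final class ScrollRequests {
    /** Endpoint for the first page: POST /<index>/_search */
    static String firstPageEndpoint(String index) {
        return "/" + index + "/_search";
    }

    /** Query-string parameter that opens a scroll context, e.g. keepAlive = "1m". */
    static Map<String, String> firstPageParams(String keepAlive) {
        return Collections.singletonMap("scroll", keepAlive);
    }

    /** Body for the first page; "size" is the number of hits per batch. */
    static String firstPageBody(int batchSize) {
        return "{\"size\":" + batchSize + ",\"query\":{\"match_all\":{}}}";
    }

    /** Body for every later page, sent to POST /_search/scroll. */
    static String nextPageBody(String keepAlive, String scrollId) {
        return "{\"scroll\":\"" + keepAlive + "\",\"scroll_id\":\"" + scrollId + "\"}";
    }

    /** Number of pages that contain hits when draining totalHits in batches. */
    static long pagesNeeded(long totalHits, int batchSize) {
        return (totalHits + batchSize - 1) / batchSize;
    }
}
```

Each page's response carries a `_scroll_id` to send with the next request, and the loop ends when a page comes back with no hits. With 10000 hits per batch, 180k documents would take 18 pages, each far smaller than the 100MB buffer.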


#5

I understand the 10k limit; that is why I increased the default to 1M. If ES lets people change the default limit on query results, then the buffer size should be changeable too. Otherwise the bufferLimitBytes (100MB) set in org.elasticsearch.client.HeapBufferedAsyncResponseConsumer is the real limit :)
Thanks


(David Pilato) #6

I wouldn't do that.

But if you really want to overload your data nodes and your client, it's up to you.

Did you read the workaround that Tanguy mentioned in the issue I linked to?

Specifically that class:


#7

Nice. I did not notice that ES has the following method:

Response performRequest(String method, String endpoint,
                        Map<String, String> params,
                        HttpEntity entity,
                        HttpAsyncResponseConsumerFactory responseConsumerFactory,
                        Header... headers)
    throws IOException;

This makes the buffer size configurable.
This is what I did:

// Overriding the 100MB buffer limit with 1GB
response = client.performRequest("GET", request.getIndexName() + typesString + "/_search", new HashMap<>(),
        new NStringEntity(queryString, ContentType.APPLICATION_JSON),
        new HeapBufferedResponseConsumerFactory(1024 * 1024 * 1024));

It is working now. Thanks for the help @dadoonet
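One thing worth checking when picking a larger limit: as far as I can tell the factory takes the limit as an `int` number of bytes, so 1GB is already about half of the largest value it can hold, and plain `int` multiplication would silently wrap around beyond that. The following is a small sketch of a guard against that. The `BufferLimits` class and its method are hypothetical names, not part of the client.

```java
// Hypothetical helper for computing a buffer limit in bytes without int overflow.
final class BufferLimits {
    // 100 MB, the limit shown in the exception message (104857600).
    static final int DEFAULT_LIMIT_BYTES = 100 * 1024 * 1024;

    /**
     * Converts a size in megabytes to bytes as an int.
     * Throws ArithmeticException if the result does not fit in an int,
     * instead of wrapping around to a negative value.
     */
    static int megabytesToLimit(int megabytes) {
        if (megabytes <= 0) {
            throw new IllegalArgumentException("megabytes must be positive: " + megabytes);
        }
        return Math.multiplyExact(megabytes, 1024 * 1024);
    }
}
```

For example, `BufferLimits.megabytesToLimit(1024)` yields the same 1GB value used above, while asking for 2048 fails fast. Since the response is buffered on the heap, the JVM also needs enough free heap to hold whatever limit is chosen.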

