Error: entity content is too long [105072697] for the configured buffer limit [104857600]

When I query data from ES 8.2.3 with a big size in one request (the index has more than 10K docs), I get the error: Error: entity content is too long [105072697] for the configured buffer limit [104857600].

After searching the docs, it seems there is a hardcoded 100mb limit on response content. Is there a way to work around or enlarge this limit?

Have you tried increasing http.max_content_length in your elasticsearch.yml? The docs say it defaults to 100mb but can be increased. Networking | Elasticsearch Guide [8.2] | Elastic
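If you do want to try it, the setting lives in elasticsearch.yml on each node and takes byte-size units; a minimal sketch (200mb is just an illustrative value, and a node restart is required):

```yaml
# elasticsearch.yml
# Maximum size of an HTTP request body accepted by the node (default 100mb).
http.max_content_length: 200mb
```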

Welcome to the forum

What client are you using? There is often a client-side HTTP response limit, and that is probably what you are seeing. You could validate by sending the exact same query with curl, which should just stream the response rather than buffering it. If that works, it would strongly suggest a client limitation.

Note that in many cases such large (100MB+) responses are not ideal; think resource usage, among other reasons. There are alternatives using the scroll API, or search_after, or ... You don't tell us enough to advise on this, but it is something to think about.
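To illustrate the search_after idea: each request asks for a modest page and passes the sort value of the last hit back in, so no single response has to carry the whole result set. A minimal sketch, with a plain in-memory list standing in for the index (no real Elasticsearch calls):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

public class SearchAfterSketch {
    // Stand-in for one search request: return up to `size` docs whose sort
    // value comes strictly after `searchAfter` (null means start from the top).
    static List<Long> search(List<Long> index, Long searchAfter, int size) {
        return index.stream()
                .filter(ts -> searchAfter == null || ts > searchAfter)
                .limit(size)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Pretend index: 25 docs sorted by a monotonically increasing key.
        List<Long> index = LongStream.range(0, 25).boxed().collect(Collectors.toList());
        List<Long> all = new ArrayList<>();
        Long after = null;
        while (true) {
            List<Long> page = search(index, after, 10); // small "size" per request
            if (page.isEmpty()) break;
            all.addAll(page);
            after = page.get(page.size() - 1);          // sort value of the last hit
        }
        System.out.println(all.size()); // 25: full result set, fetched in 3 pages
    }
}
```

Against a real cluster the same loop would send search requests with a deterministic "sort" plus a "search_after": [...] clause carrying the previous page's last sort values.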

The docs say (my emphasis)

http.max_content_length: Maximum size of an HTTP request body. Defaults to 100mb.

I believe @Chen_Wen is hitting the error on 100mb+ responses rather than requests.
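The numbers in the error message do line up with a buffered-response limit of exactly 100 MiB; a quick sanity check of the arithmetic:

```java
public class BufferLimitMath {
    public static void main(String[] args) {
        long limit = 100L * 1024 * 1024; // "100mb" expressed in bytes
        long entity = 105_072_697L;      // entity size from the error message
        System.out.println(limit);          // 104857600, matching the error
        System.out.println(entity - limit); // 215097 bytes (~0.2%) over the limit
    }
}
```

So the response only just overflows the buffer, which is why trimming the returned fields down might squeeze it under.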


That’s correct. I use the Trino SQL engine to connect to ES and use its raw_query to fetch logs in one request. Part of the query is below. For a big time-range query, the response body comes back a little bigger than 100mb and I get the error above.

SELECT result
      FROM TABLE(elastic_mozart.system.raw_query(schema => 'default', index => 'ailogs-oneapi', query => '{
  "size": 10000000,
  "query": {
   "bool": {
   "filter": [
    {.....

I know I could use the scroll/streaming API with curl, but my case uses the Trino Elasticsearch plugin, and I can’t change its raw_query function. I could of course use a plain SQL query instead of raw_query, but its performance is really slow. For most of my cases 100mb is fine; only a few full-data queries hit the limitation.

I read the related discussion in the session below, but have no idea how to work around it; enlarging the limit to a little more than 100mb would be OK for me.

I looked at the documentation and found that they use scroll behind the scenes, and I guess that only works with something other than raw queries.

elasticsearch.scroll-size: Sets the maximum number of hits that can be returned with each Elasticsearch scroll request. Default: 1000
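For context, that is a Trino catalog property rather than an Elasticsearch setting; a sketch of where it would live (the catalog name and host are placeholders matching this thread):

```properties
# etc/catalog/elastic_mozart.properties
connector.name=elasticsearch
elasticsearch.host=your-es-host
elasticsearch.port=9200
# hits per scroll round-trip; this batches the fetching, it does not cap the total result
elasticsearch.scroll-size=1000
elasticsearch.scroll-timeout=1m
```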

The raw query you shared is not supposed to work in Elasticsearch unless you changed the default index settings. Did you?

Can't you use a standard SQL Query instead? What is the full query you are sending?

Thanks for checking @dadoonet, yes I could. I actually did use standard SQL via scroll requests at the beginning, but the performance is really slow compared to raw_query, where I can take advantage of fast querying on the Elasticsearch side to prepare the data.

I did enlarge the index setting so scroll requests can exceed 1000. And it is definitely NOT an ES query problem; the problem is that the response content is bigger than 100mb.

I just went through trino/plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/client/ElasticsearchClient.java at master · trinodb/trino · GitHub and tried to find a way to customize the response content size restriction. But it seems to depend on the official ES client class…

elastic_mozart.system.raw_query(
  schema => 'default',  
  index => 'ailogs-oneapi',  
  query => '{
  "size": 10000000,
  "query": {
   "bool": {
   "filter": [
    {
    "range": {
     "@timestamp": {
     {%- set bounds = [] -%}
      
     {%- if from_dttm -%}
      {%- set _ = bounds.append("\"gte\": \"" ~ (from_dttm | string | replace(" ", "T")) ~ "\"") -%}
     {%- endif -%}
      
     {%- if to_dttm -%}
      {%- set _ = bounds.append("\"lte\": \"" ~ (to_dttm | string | replace(" ", "T")) ~ "\"") -%}
     {%- endif -%}

     {{ bounds | join(", ") if bounds else "\"gte\": \"0\"" }}
     }
    }
    },
    {
    "exists": { "field": "org_cid" }
    }
  
    {%- set selected_users = filter_values("username") -%}
    {%- if selected_users -%}
    , {
    "terms": {
     "calc_coreid": [
     {%- for user in selected_users -%}
     "{{ user }}"{% if not loop.last %}, {% endif %}
     {%- endfor -%}
     ]
    }
    }
    {%- endif -%}
     
    {%- set selected_managers = filter_values("manager") -%}
    {%- if selected_managers -%}
    , {
    "bool": {
     "minimum_should_match": 1,
     "should": [
     {%- for mgr in selected_managers -%}
     {
     "wildcard": {
     "org_chain.keyword": "*{{ mgr }}*"
     }
     }{% if not loop.last %}, {% endif %}
     {%- endfor -%}
     ]
    }
    }
    {%- endif -%}
   
    {%- set selected_models = filter_values("modelname") -%}
    {%- if selected_models -%}
    , {
    "terms": {
     "oneapi_modelname.keyword": [
     {%- for model in selected_models -%}
     "{{ model }}"{% if not loop.last %}, {% endif %}
     {%- endfor -%}
     ]
    }
    }
    {%- endif -%}

   ],
   "must_not": [
    {
     "term": {
      "org_cid.keyword": "INVALID"
     }
    },
    {
     "wildcard": {
      "org_cid.keyword": "*-api"
     }
    }
   ]
   }
  },
  "_source": ["@timestamp", "org_cid", "oneapi_modelname", "org_chain"],
  "fields": ["calc_coreid"]
  }')

My gut feeling is that you are on the wrong road, expecting a sort of bulk exporter. There are other tools for that. But obviously you know your use case and limitations better than I.

If you are just over the 100mb cusp, consider whether you can drop fields from the returned response, i.e. reduce the returned fields to the absolute minimum and it might fit? But that would be kicking the can down the road a little bit.

Good luck. Maybe someone else can help you bump the limit ....

Thanks for the reminder @RainTown. I still want to try compiling the target jar, elasticsearch-rest-client-6.8.23.jar, and found this class: elasticsearch/client/rest/src/main/java/org/elasticsearch/client/HttpAsyncResponseConsumerFactory.java at 706067211ae880bbe4669286ee976552e8a60446 · elastic/elasticsearch · GitHub
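For what it's worth, recompiling the jar shouldn't be necessary: the low-level REST client already lets a caller raise the buffer limit per request through RequestOptions. A sketch of that client-side API (whether you can reach this from inside the Trino connector without patching it is a separate question; the index name is just the one from this thread):

```java
import org.elasticsearch.client.HttpAsyncResponseConsumerFactory.HeapBufferedResponseConsumerFactory;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RequestOptions;

public class BiggerBufferSketch {
    public static Request buildRequest() {
        int bufferLimitBytes = 200 * 1024 * 1024; // 200 MiB, up from the 100 MiB default
        RequestOptions options = RequestOptions.DEFAULT.toBuilder()
            .setHttpAsyncResponseConsumerFactory(
                new HeapBufferedResponseConsumerFactory(bufferLimitBytes))
            .build();
        Request request = new Request("GET", "/ailogs-oneapi/_search");
        request.setOptions(options); // responses to this request may buffer up to 200 MiB
        return request;
    }
}
```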

But I don’t know how to compile it. Could you guide me on how to generate this client jar for the specific version I use, 6.8.23?

What is your elasticsearch version?

I saw that Trino is compatible with 8.x. So I don’t understand what this 6.x version is doing here.

Very unclear to me.
As Trino seems to maintain the connector, I’d ask them for support. There’s nothing wrong on Elasticsearch side.

If you are still using ES 6.x, it’s a matter of urgency to upgrade your cluster. 6.x did not get a lot of the security patches. Please update to 9.x.