Python Elasticsearch API: Error 'data too large' when iterating

I am using the Python Elasticsearch API to interact with my Elastic cluster, and I'm getting an error when I perform several searches in a for loop. In short, I iterate over a list of values; in each iteration I take one value from the list, build a query with it to retrieve a specific set of documents, and run a search (`client.search(index=..., query=..., aggs=...)`) to get a terms aggregation. After some number of iterations, I receive a 'data too large' error. I'm confident that no single aggregation returns an excessively large number of terms.
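
For reference, here is a minimal sketch of the loop, assuming elasticsearch-py 8.x; the endpoint, index name, field names, and values list are placeholders, not my actual ones:

```python
# Minimal sketch of the loop pattern described above.
# Endpoint, index, fields, and values are placeholders.
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # placeholder endpoint

values = ["value-1", "value-2", "value-3"]  # placeholder list of filter values

for value in values:
    response = client.search(
        index="my-index",                     # placeholder index name
        query={"term": {"my_field": value}},  # restrict docs to this value
        aggs={
            "my_terms": {                     # one terms aggregation per iteration
                "terms": {"field": "my_keyword_field", "size": 100}
            }
        },
        size=0,  # only the aggregation result is needed, no hits
    )
    buckets = response["aggregations"]["my_terms"]["buckets"]
    print(value, len(buckets))
```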

I assume there is some kind of buffer that is not emptied after each iteration. I'd appreciate any insight into this matter.

Here is the complete error I get:

BadRequestError(400, 'search_phase_execution_exception', 'task cancelled [Fatal failure during search: failed to merge result [[parent] Data too large, data for [<reduce_aggs>] would be [999285563/952.9mb], which is larger than the limit of [996147200/950mb], real usage: [999285528/952.9mb], new bytes reserved: [35/35b], usages [inflight_requests=952/952b, request=106981446/102mb, fielddata=299829275/285.9mb, eql_sequence=0/0b, model_inference=0/0b]]]')

I've gotten the same error before when using for loops. In those cases I partially worked around the issue by creating the client inside the loop; however, I can't do that in the case described above (and I don't think it's desirable anyway). I also sometimes get an ApiError(429).
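
For context, the earlier partial workaround looked roughly like this (again with placeholder names):

```python
# Rough sketch of the earlier partial workaround: recreating the client on
# every iteration. All names are placeholders.
from elasticsearch import Elasticsearch

values = ["value-1", "value-2", "value-3"]  # placeholder filter values

for value in values:
    client = Elasticsearch("http://localhost:9200")  # new client each iteration
    client.search(
        index="my-index",
        query={"term": {"my_field": value}},
        aggs={"my_terms": {"terms": {"field": "my_keyword_field"}}},
        size=0,
    )
    client.close()  # release the underlying HTTP connections
```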
