Reduce update_by_query memory use

Jesus_Rodriguez · March 27, 2020, 9:55am

Hi.

In one of my current Elasticsearch use cases I have an index with relatively large documents. Initially, we chose to go with update_by_query. However, we recently hit an OOM error caused by one of those updates, and when checking the stack trace I saw that the full document was being loaded into memory.

I have tried to solve this by adding a _source filter to the update_by_query, but that causes every field not included in the source filter to go away on reindexing. Is there any way to reduce the number of fields loaded in the hits without deleting the excluded fields from the original document?

Thanks!

Christian_Dahlqvist · March 29, 2020, 1:57pm

Elasticsearch needs to reindex the whole document even if your processing or criteria only touch a few fields, so I believe the answer is no.
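
To illustrate why the excluded fields disappear, here is a minimal toy model in Python. It is not Elasticsearch code and does not call any Elasticsearch API: the in-memory `index` dict, `apply_source_filter`, `toy_update_by_query`, and `bump_counter` are all hypothetical names made up for this sketch. It only assumes the behaviour described in this thread, namely that the update writes back whatever document it loaded as a whole replacement rather than merging it into the stored one.

```python
from copy import deepcopy


def apply_source_filter(doc, includes):
    """Mimic a _source include filter: keep only the listed top-level fields."""
    return {key: deepcopy(value) for key, value in doc.items() if key in includes}


def toy_update_by_query(index, script, includes=None):
    """Toy model (assumption, not Elasticsearch internals): load each document,
    optionally filtered, run the script on it, then write the result back whole."""
    for doc_id, stored in list(index.items()):
        if includes is not None:
            loaded = apply_source_filter(stored, includes)
        else:
            loaded = deepcopy(stored)
        script(loaded)
        # Full replacement, not a merge: anything that was not loaded is lost.
        index[doc_id] = loaded


def bump_counter(doc):
    """Hypothetical update script that touches a single small field."""
    doc["counter"] = doc.get("counter", 0) + 1


# Without a filter, the large field is loaded and survives the update.
index = {"1": {"counter": 1, "payload": "x" * 1000}}
toy_update_by_query(index, bump_counter)

# With a filter, less is loaded, but the large field is gone afterwards.
filtered_index = {"1": {"counter": 1, "payload": "x" * 1000}}
toy_update_by_query(filtered_index, bump_counter, includes={"counter"})

print(sorted(index["1"]))           # ['counter', 'payload']
print(sorted(filtered_index["1"]))  # ['counter']
```

In this model the memory saving and the data loss come from the same step: the document that is written back is exactly the one that was loaded, so filtering what gets loaded also filters what gets stored.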

