Hi, I'm using the Elasticsearch Python client (8.8.2) to bulk insert data into an index. There are 200,000+ (2 lakh) documents I need to insert.
I'm using the parallel_bulk helper to sync them, but when I track the process's RAM usage I see that parallel_bulk creates threads to process the data, and the virtual memory footprint of those threads keeps increasing.
I have 32 GB of RAM on the machine the data is pushed from, and Elasticsearch runs on a separate machine, also with 32 GB of RAM.
Is there any way to limit the memory usage of these threads?
I tried setting thread_count to 1, but memory usage still keeps growing.