Bulk inserts bring down query performance

Our application occasionally performs large volumes of index inserts/updates, on the order of 10,000 documents at a time. To speed this up, we allocate the upload/insert work across 10 threads, running continuously until all of the inserts/updates are complete.
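For context, the thread-split approach looks roughly like this (a minimal sketch, not our production code; `send` is a placeholder for whatever performs the HTTP POST to the `_bulk` endpoint, and the function names and chunk size are hypothetical):

```python
import json
from concurrent.futures import ThreadPoolExecutor

def to_bulk_body(docs, index):
    """Render docs into the NDJSON format the _bulk API expects:
    one action line followed by one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

def chunked(seq, size):
    """Yield successive slices of `seq` of at most `size` items."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def bulk_upload(docs, index, send, workers=10, chunk_size=500):
    """Split docs into chunks and hand each chunk's bulk body to `send`,
    running up to `workers` requests concurrently."""
    bodies = (to_bulk_body(chunk, index) for chunk in chunked(docs, chunk_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Force evaluation so all chunks are submitted and completed.
        list(pool.map(send, bodies))
```

The concurrency knob here (`workers`) is the same one we would reduce if the cluster can't keep up with 10 parallel bulk requests.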

We're finding that the inserts/updates are having a substantial impact on query performance, essentially taking query capability offline (too slow to be usable).

We're running on Elastic Cloud (elastic.co): IO Optimized, 3 zones, 1 node per zone, 8 GB per node.

If query performance is fine except when performing bulk updates/inserts, which of our environment parameters should we increase? More nodes, more RAM, something else?

Appreciate any specific or general guidance.

Indexing can be CPU-intensive as well as disk I/O-intensive. It may be worth looking at your monitoring data to see whether you can identify the limiting factor. I believe CPU resources are allocated in proportion to node size, so I would recommend increasing the size of the nodes and seeing what impact that has. You can also reduce the concurrency, and perhaps disable refreshing while you bulk upload, then re-enable it once you are done.
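Disabling refresh can be done per index via the index settings API (assuming an index named `my-index`; substitute your own), e.g. in Kibana Dev Tools:

```
PUT /my-index/_settings
{
  "index": { "refresh_interval": "-1" }
}
```

And once the bulk load is done, setting it back to `null` restores the default interval:

```
PUT /my-index/_settings
{
  "index": { "refresh_interval": null }
}
```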

Thanks @Christian_Dahlqvist. We have tried disabling refreshes; it didn't help. The idea that CPU resources are allocated in proportion to node size sounds promising, but I don't see that documented anywhere. We can give it a try in any case.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.