When to use the Bulk API

ChrisFord · October 20, 2017, 4:19pm

Hi

We are currently using the Bulk API to insert documents on initial load of a new index. When it comes to updates and keeping the index in sync with our database, should we still be using the Bulk API or should we just use Update?

I have read a fair bit about this and there seems to be some contradiction. Obviously the Update API approach sends a lot more requests to the cluster; however, once replication and the refresh interval are set, the Bulk API might not be the best approach either.

One thing I was considering was queuing our updates until a certain threshold is met (or, if the threshold isn't met, until a certain time has elapsed) and then sending them with the Bulk API, but I wanted to see if there are any other approaches or best practices.
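To make the queuing idea concrete, here is a minimal sketch in Python of a buffer that flushes when either a size threshold or an age threshold is reached. The class name `UpdateBuffer`, its parameters, and the `send_batch` callback are all hypothetical; `send_batch` stands in for whatever code actually issues the Bulk API request, and the default thresholds are placeholders rather than recommendations.

```python
import time


class UpdateBuffer:
    """Collects pending updates and hands them off as one batch.

    Hypothetical helper: `send_batch` is any callable that takes a list
    of updates (for example, a function that issues one bulk request).
    """

    def __init__(self, send_batch, max_items=500, max_age_seconds=5.0,
                 clock=time.monotonic):
        self._send_batch = send_batch
        self._max_items = max_items
        self._max_age = max_age_seconds
        self._clock = clock
        self._pending = []
        self._oldest = None  # time at which the oldest pending update arrived

    def add(self, update):
        """Queue one update and flush if a threshold has been reached."""
        if not self._pending:
            self._oldest = self._clock()
        self._pending.append(update)
        self.flush_if_due()

    def flush_if_due(self):
        """Flush when the batch is large enough or old enough.

        Returns True if a flush happened. Call this periodically (for
        example from a timer) so quiet periods still get flushed.
        """
        if not self._pending:
            return False
        too_many = len(self._pending) >= self._max_items
        too_old = self._clock() - self._oldest >= self._max_age
        if too_many or too_old:
            self.flush()
            return True
        return False

    def flush(self):
        """Send everything that is pending, if anything."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._oldest = None
        self._send_batch(batch)
```

The size check runs on every `add`, but the age check only runs when something calls `flush_if_due`, so a background timer or scheduler would need to call it at an interval shorter than `max_age_seconds`.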

Thanks in advance

Chris

forloop · October 22, 2017, 10:08pm

Have you considered sending updates with the Bulk API?
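For reference, a bulk request carries updates as newline-delimited JSON: one `update` action line followed by one line holding the partial document. The sketch below builds such a body in Python; the function name `build_bulk_update_body` and the index name in the usage line are hypothetical, and depending on the Elasticsearch version the action line may also need a `_type`.

```python
import json


def build_bulk_update_body(index, docs):
    """Build an NDJSON body of partial updates for the _bulk endpoint.

    `docs` maps document id -> dict of fields to change (hypothetical shape).
    """
    lines = []
    for doc_id, partial in docs.items():
        # Action line: which document to update.
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        # Payload line: the fields to merge into the existing document.
        lines.append(json.dumps({"doc": partial}))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_update_body("products", {"1": {"price": 10}})
```

The resulting string would be sent as the body of a POST to `_bulk` with the `application/x-ndjson` content type.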

ChrisFord · October 23, 2017, 8:41am

Hi Russ,

Yeah, currently we are using the Bulk API for updates as well. We are finding that we get CPU spikes when we are creating and populating a new index while at the same time keeping the current 'live' index up to date, which after a while causes the Elasticsearch node to go red. One option I was looking at was reducing the number of records in the sync process; however, as I said, I have read some blogs that say once the refresh interval and replicas are set you shouldn't use the Bulk API anymore.

Thanks

Chris

