Understanding Why Bulk Update is so Fast

Ishan_Durugkar · June 19, 2014, 5:31am

Hi,
I need help understanding the ES/Lucene update operation.

I have to index a large number of documents, many of which share the same
content but have different names.

Structure:
"file": {
    "content": {
        "type": "string"
    },
    "name": {
        "type": "string"
    }
}

Now if I index all the documents separately, it takes close to 10 hours.
But if I index only the unique content, and then send an update that adds
to the 'name' field for every repeat of that content, the indexing takes
just 45 minutes or so. The update operations are sent as Bulk requests.
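
For readers who want to picture the bulk updates described above, here is a minimal sketch of how such a request body could be built. It is not the poster's actual code. The index name "files", the document ids, and the helper build_bulk_update_body are hypothetical, the script uses the Painless syntax of recent Elasticsearch releases (versions from 2014 used a different scripting syntax), and it assumes 'name' is stored as an array.

```python
import json


def build_bulk_update_body(index, updates):
    """Build an NDJSON body for the _bulk endpoint.

    `updates` is an iterable of (doc_id, name) pairs. Each pair becomes one
    scripted update that appends `name` to the document's 'name' array.
    The index name and ids are whatever the caller passes (hypothetical here).
    """
    lines = []
    for doc_id, name in updates:
        # Action line: identifies the document to update.
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        # Payload line: assumes 'name' is already an array in _source.
        lines.append(json.dumps({
            "script": {
                "lang": "painless",
                "source": "ctx._source.name.add(params.name)",
                "params": {"name": name},
            }
        }))
    # The bulk API expects newline-delimited JSON ending with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_update_body(
        "files",
        [("doc-1", "report_copy.txt"), ("doc-1", "report_final.txt")],
    )
    print(body)
```

The resulting string would be sent as the body of a POST to the _bulk endpoint. Each update still makes Elasticsearch read the stored document, apply the script, and index the result again.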

I have checked and all the names are added to the 'name' field.
How is the update operation happening so fast? I thought update internally
deletes the old document and creates a new one?

ES settings: 1 node, 3 shards
Heap size: 4GB

