Update/Upsert Performance Improvements

strfx · September 17, 2015, 12:18pm

Hi all,

I have data that is updated very frequently, so I use bulk updates (50k documents, ~25MB) to update it in Elasticsearch.

If a document is already present, I use a scripted update (to increment a counter); if not, I just use the upsert document.

While this works great on a fresh index (one bulk takes about 15 seconds), subsequent bulks (which mostly consist of updates) take around 3-4 minutes each.

Elasticsearch is running on a single server (36GB RAM, 20GB Heap Size, 24 cores, dedicated 1GBit/s NIC)

My elasticsearch.yml

<redacted>
## Threading
threadpool.search.type: fixed
threadpool.search.size: 20
threadpool.search.queue_size: 100

# Bulk pool
threadpool.bulk.type: fixed
threadpool.bulk.size: 60
threadpool.bulk.queue_size: 300

# Index pool
threadpool.index.type: fixed
threadpool.index.size: 20
threadpool.index.queue_size: 100

# Indices settings
indices.memory.index_buffer_size: 30%
indices.memory.min_shard_index_buffer_size: 12mb
indices.memory.min_index_buffer_size: 96mb

index.translog.flush_threshold_ops: 50000
</redacted>

The script looks like this:

{ "script" : "ctx._source.count += %d; ctx._source.touch = timestamp", "params": { "timestamp" : %d }}

Before updating the documents, refresh_interval is set to -1.

I've been monitoring the progress using the great bigdesk plugin and didn't notice any changes: the threadpool is fine (no queued requests), GC behaviour is no different, et cetera.

Do you have any more hints where I could look for this bottleneck? Can I provide further details?

Cheers

nik9000 · September 17, 2015, 12:44pm

From here you'll have to learn how to read Java stack traces and identify hot spots from them. There are two tools available to you right now: the hot_threads API and jstack.

hot_threads attempts to guess which threads are causing trouble and gets you a snapshot of them. It works fine when one action is slow but if you have lots of actions that are slow but faster than the hot_threads window then it doesn't work well and you have to use jstack.
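For reference, a minimal sketch of calling hot_threads from Python. The endpoint and its `threads` and `interval` parameters are part of the nodes hot_threads API; the host `http://localhost:9200` and the helper names are assumptions for illustration only.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(base_url, threads=3, interval="500ms"):
    """Build the URL for the nodes hot_threads API.

    `threads` is how many busy threads to report per node and `interval`
    is the sampling window.
    """
    query = urlencode({"threads": threads, "interval": interval})
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{query}"


def fetch_hot_threads(base_url="http://localhost:9200", **kwargs):
    """Return the plain-text hot_threads report (assumed host, no auth)."""
    with urlopen(hot_threads_url(base_url, **kwargs)) as resp:
        return resp.read().decode("utf-8")
```

Calling `print(fetch_hot_threads())` a few times while a slow bulk is running should show whether the same frames keep turning up.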

With jstack you have to take multiple dumps yourself and classify the threads manually. That isn't as hard as it sounds - I've done it with sed.
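A rough sketch of that classification in Python rather than sed. It assumes you have saved several jstack dumps as text and that the interesting threads have "[bulk]" in their name, as Elasticsearch bulk threads do; the function names are made up for this example.

```python
import re
from collections import Counter

# A jstack thread header starts with the quoted thread name.
THREAD_HEADER = re.compile(r'^"([^"]*)"')
# A stack frame line looks like "\tat pkg.Class.method(File.java:123)";
# capture only the method so different line numbers group together.
FRAME = re.compile(r'^\s+at ([^\s(]+)\(')


def thread_stacks(dump_text):
    """Yield (thread_name, [methods, top frame first]) for one jstack dump."""
    name, frames = None, []
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            if name is not None:
                yield name, frames
            name, frames = header.group(1), []
            continue
        frame = FRAME.match(line)
        if frame and name is not None:
            frames.append(frame.group(1))
    if name is not None:
        yield name, frames


def hot_spots(dumps, name_filter="[bulk]", depth=1):
    """Count how often each top-of-stack (first `depth` frames) appears
    across all dumps, for threads whose name contains `name_filter`."""
    counts = Counter()
    for dump in dumps:
        for name, frames in thread_stacks(dump):
            if name_filter in name and frames:
                counts[" <- ".join(frames[:depth])] += 1
    return counts
```

Feeding it ten or twenty dumps taken a second apart and printing `hot_spots(dumps).most_common(5)` gives a crude profile: whatever sits at the top of most samples is where the threads spend their time.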

nik9000 · September 17, 2015, 12:57pm

Also have a look at the Elasticsearch logs to see if it is logging messages about merges falling behind. If it is then you might want to have a look at merge throttling.

strfx · September 17, 2015, 1:06pm

I haven't found any indicators of merges falling behind. Many thanks for those tools, I'm currently diving into them!

hot_threads is already telling me that Elasticsearch is pretty busy with bulking:

   17.9% (89.5ms out of 500ms) cpu usage by thread 'elasticsearch[Abe Brown][bulk][T#5]'
 10/10 snapshots sharing following 15 elements
   groovy.lang.GroovyClassLoader.parseClass(GroovyClassLoader.java:256)
   groovy.lang.GroovyClassLoader.parseClass(GroovyClassLoader.java:245)
   groovy.lang.GroovyClassLoader.parseClass(GroovyClassLoader.java:203)
   org.elasticsearch.script.groovy.GroovyScriptEngineService.compile(GroovyScriptEngineService.java:148)
   org.elasticsearch.script.ScriptService.getCompiledScript(ScriptService.java:409)
   org.elasticsearch.script.ScriptService.compile(ScriptService.java:396)
   org.elasticsearch.script.ScriptService.executable(ScriptService.java:518)
   org.elasticsearch.action.update.UpdateHelper.prepare(UpdateHelper.java:183)
   org.elasticsearch.action.bulk.TransportShardBulkAction.shardUpdateOperation(TransportShardBulkAction.java:523)
   org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:239)
   org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:512)
   org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:419)
   java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   java.lang.Thread.run(Thread.java:745)

I'll take a further look with jstack.

Many thanks!

nik9000 · September 17, 2015, 1:27pm

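I think it's telling you something is up with Groovy. It looks like it's doing a lot of compiling. Because the increment is substituted into the script text itself, every distinct value produces a different script source that has to be compiled again. Maybe you should replace your script with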

{ "script" : "ctx._source.count += increment; ctx._source.touch = timestamp", "params": { "timestamp" : %d, "increment": %d }}

BTW the %d makes me think you are building the whole JSON blob with string substitution. That is probably safe for things like this, but you have to be super careful with escaping. Going with a JSON building library is probably safer.
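Something along these lines, as a sketch in Python using the standard json module. The script text is the one above; the index, type and id arguments and the shape of the upsert document are assumptions, since the thread doesn't show them.

```python
import json

# The script source stays constant; only the params change per document,
# so the compiled script can be reused instead of recompiled.
SCRIPT = "ctx._source.count += increment; ctx._source.touch = timestamp"


def bulk_update_lines(index, doc_type, doc_id, increment, timestamp):
    """Return the two newline-terminated lines for one bulk update action.

    The upsert document below is a guess at what a fresh document
    might look like; adjust it to your mapping.
    """
    action = {"update": {"_index": index, "_type": doc_type, "_id": doc_id}}
    body = {
        "script": SCRIPT,
        "params": {"increment": increment, "timestamp": timestamp},
        "upsert": {"count": increment, "touch": timestamp},
    }
    # json.dumps handles quoting and escaping of ids and values.
    return json.dumps(action) + "\n" + json.dumps(body) + "\n"
```

Concatenating the result for every document gives the bulk request body, and ids containing quotes or backslashes come out correctly escaped without any manual handling.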

strfx · September 17, 2015, 3:51pm

Wow, it worked out! Many thanks for this one! Those tools are going to be on my toolbelt from now on.

Thanks for the hint about the escaping within the json building.

nik9000 · September 17, 2015, 4:04pm

I'm glad it worked for you! What does the performance look like now, btw?

strfx · September 17, 2015, 4:21pm

I'm hitting around 50-60 seconds per bulk, which suits our needs pretty well.
