Bulk indexing performance

Hello everybody,

We have an indexing process that reads batches of data from a datasource and indexes them into an Elasticsearch cluster using bulk index requests (roughly one bulk per batch). The data in a single batch can go to several indices.
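For illustration, a loop of that shape could look roughly like this (a sketch using the Python elasticsearch client, with hypothetical field names, not our actual code):

# A sketch of the indexing loop described above, using the Python elasticsearch
# client; the field names ("target_index", "body") are hypothetical.
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["http://localhost:9200"])

def index_batch(batch):
    # One bulk request per batch; each document may target a different index.
    actions = [
        {
            "_op_type": "index",
            "_index": doc["target_index"],   # e.g. "logs_2015-08_performance_0"
            "_type": "doc",                  # mapping type, still required on 2.x
            "_source": doc["body"],
        }
        for doc in batch
    ]
    helpers.bulk(es, actions)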

While doing some testing we noticed a significant degradation in bulk index request performance when the documents in a batch go to several indices. The more mixed the batch (and therefore the bulk), the slower the bulk index request is.

Can anyone explain this behavior to us?

Thanks in advance,
Best,
Augusto.-


This should not be the case. What version? When you say "several indices", can you be more precise (what's an upper bound on the number of indices)?


Elasticsearch version 2.2. Our cluster has 11 nodes (15 GB RAM and 8 CPU cores per node). We have around 400 indices.

In one test we had documents for every one of the 400 indices in the same bulk. Another important detail is that we are performing around 500 bulk index requests per second.

Best,
Augusto.-

What is the size of your bulk requests? Do you have fairly static mappings, or are these updated frequently?

The bulk requests use the default configuration (1,000 documents per bulk) and our mappings are static. We also have index auto-creation disabled.

I think I know what might be happening. Can you take a few hot thread traces while repeatedly executing bulk requests with payloads targeting many indices, and again targeting one or a small number of indices?
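For example, a few samples can be captured while the load is running (a sketch assuming the same Python client as the earlier snippet; curl against the /_nodes/hot_threads endpoint works just as well):

import time

# Grab a handful of hot thread samples while the bulk load is running.
for _ in range(5):
    print(es.nodes.hot_threads(threads=3))   # top 3 hottest threads per node
    time.sleep(2)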

The main differences we found between the two traces were pieces like this:

[2017-01-12 15:39:37,595][TRACE][index.shard              ] [node-5] [logs_2015-08_performance_0][2] skip writing shard state, has been written before; previous version:  [4] current version [4]
[2017-01-12 15:39:37,595][TRACE][index.shard              ] [node-5] [logs_2015-08_performance_0][26] skip writing shard state, has been written before; previous version:  [4] current version [4]
[2017-01-12 15:39:37,595][TRACE][index.shard              ] [node-5] [logs_2015-08_performance_0][18] skip writing shard state, has been written before; previous version:  [4] current version [4]


.......


[2017-01-12 15:40:00,371][TRACE][transport.tracer         ] [node-5] [2015940][indices:data/write/bulk[s]] sent error response (error: [[node-5][10.75.114.160:9305][indices:data/write/bulk[s][p]]])
[2017-01-12 15:40:00,371][TRACE][transport.tracer         ] [node-5] [2015949][indices:data/write/bulk[s]] received request
[2017-01-12 15:40:00,371][TRACE][transport.tracer         ] [node-5] [1995643][indices:data/write/bulk[s][p]] received response from [{node-5}{IOKuMzJbRjesv8NInGiuBQ}{10.75.114.160}{10.75.114.160:9305}{ips_id=6D46:6324, master=false}]
[2017-01-12 15:40:00,375][TRACE][indices.breaker.request  ] [request] Adjusted breaker by [16440] bytes, now [16440]
[2017-01-12 15:40:00,375][TRACE][index.store.deletes      ] [node-5][logs_2003-10_performance_0][14] StoreDirectory.deleteFile: delete file _2.fdt
[2017-01-12 15:40:00,375][TRACE][transport.tracer         ] [node-5] [1995649][indices:data/write/bulk[s][p]] received response from [{node-5}{IOKuMzJbRjesv8NInGiuBQ}{10.75.114.160}{10.75.114.160:9305}{ips_id=6D46:6324, master=false}]

.......


[2017-01-12 15:40:00,416][TRACE][indices.breaker.request  ] [request] Adjusted breaker by [-16440] bytes, now [0]
[2017-01-12 15:40:00,417][TRACE][transport.tracer         ] [node-5] [1994226][indices:data/write/bulk[s][p]] received response from [{node-5}{IOKuMzJbRjesv8NInGiuBQ}{10.75.114.160}{10.75.114.160:9305}{ips_id=6D46:6324, master=false}]
[2017-01-12 15:40:00,417][TRACE][indices.breaker.request  ] [request] Adjusted breaker by [16440] bytes, now [16440]
[2017-01-12 15:40:00,417][TRACE][transport.tracer         ] [node-5] [2004520][indices:data/write/bulk[s]] sent response
[2017-01-12 15:40:00,417][TRACE][indices.breaker.request  ] [request] Adjusted breaker by [-16440] bytes, now [0]
[2017-01-12 15:40:00,417][TRACE][transport.tracer         ] [node-5] [1993906][indices:data/write/bulk[s][p]] received response from [{node-5}{IOKuMzJbRjesv8NInGiuBQ}{10.75.114.160}{10.75.114.160:9305}{ips_id=6D46:6324, master=false}]
[2017-01-12 15:40:00,417][TRACE][indices.breaker.request  ] [request] Adjusted breaker by [16440] bytes, now [16440]
[2017-01-12 15:40:00,417][TRACE][transport.tracer         ] [node-5] [2001951][indices:data/write/bulk[s]] sent response
[2017-01-12 15:40:00,418][DEBUG][indices.memory           ] [node-5] recalculating shard indexing buffer, total is [1.4gb] with [195] active shards, each shard set to indexing=[7.7mb], translog=[64kb]
[2017-01-12 15:40:00,418][DEBUG][index.shard              ] [node-5] [logs_2000-12_performance_0][2] updating index_buffer_size from [7.8mb] to [7.7mb]; IndexWriter now using [427204] bytes
[2017-01-12 15:40:00,418][DEBUG][index.shard              ] [node-5] [logs_2000-12_performance_0][18] updating index_buffer_size from [7.8mb] to [7.7mb]; IndexWriter now using [431760] bytes
[2017-01-12 15:40:00,418][DEBUG][index.shard              ] [node-5] [logs_2000-12_performance_0][26] updating index_buffer_size from [7.8mb] to [7.7mb]; IndexWriter now using [431760] bytes
[2017-01-12 15:40:00,418][DEBUG][index.shard              ] [node-5] [logs_2000-12_performance_0][10] updating index_buffer_size from [7.8mb] to [7.7mb]; IndexWriter now using [430140] bytes
[2017-01-12 15:40:00,419][DEBUG][index.shard              ] [node-5] [logs_2012-05_performance_0][24] updating index_buffer_size from [7.8mb] to [7.7mb]; IndexWriter now using [425584] bytes
[2017-01-12 15:40:00,419][TRACE][index.engine.lucene.iw   ] [node-5] [logs_1998-01_performance_0][16] elasticsearch[node-5][refresh][T#7] DW: publishFlushedSegment seg-private updates=null
[2017-01-12 15:40:00,419][TRACE][index.engine.lucene.iw   ] [node-5] [logs_1998-01_performance_0][16] elasticsearch[node-5][refresh][T#7] IW: publishFlushedSegment

Okay, that was not what I was looking for, but it still gives me enough information to conclude that it was what I suspected.

I was looking for output from the hot threads API. If you can still provide that, and simultaneously turn on trace logging for org.elasticsearch.indices.IndexingMemoryController (as that is exactly where my suspicion lies), we can nail this down even tighter. It's not necessary, though; I think we already have what we need.
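For reference, that trace logging can be switched on dynamically; judging by the [indices.memory] logger name in your excerpt, something along these lines should do it (a sketch; the exact logger key can vary between versions):

# Switch the indexing-memory logger to TRACE via the cluster settings API
# (logger key inferred from the "[indices.memory]" lines in the excerpt above;
# it may differ between versions).
es.cluster.put_settings(body={
    "transient": {"logger.indices.memory": "TRACE"}
})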

Here's what I think is happening: there is a memory budget that is shared among the index writers of all active shards. If you're writing to 400 indices, that budget is shared across the active shards of those 400 indices. I don't know how many shards you have, but the default budget is 10% of the heap, and if you divide that across 400 shards (assuming 1 shard per index) or 2,000 shards (assuming the default of 5 shards per index), you are left with a very small budget per shard. This means that the shards are constantly flushing, and that is terrible for performance.
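As a rough back-of-the-envelope check, using the figures from the indices.memory line in your excerpt:

# Back-of-the-envelope check using the figures from the log excerpt above:
# "recalculating shard indexing buffer, total is [1.4gb] with [195] active shards"
total_buffer_mb = 1.4 * 1024        # total indexing buffer reported by the node
active_shards = 195                 # active shards reported at that moment

print(round(total_buffer_mb / active_shards, 1), "MB per active shard")  # ~7 MB
# With 400 indices and many shards per index spread over 11 nodes, even more
# shards can become active at once, shrinking this further and forcing
# constant tiny flushes. Raising indices.memory.index_buffer_size (set in
# elasticsearch.yml) to e.g. 20% would double the total budget.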

You either need to change your write patterns (so that not all shards are active at the same time) or increase the memory budget (via indices.memory.index_buffer_size; the default is 10% of the heap).

I hope that helps.


Wow, thanks! We will try that and keep you posted. Just FYI, we have 30 shards per index.

Yeah, so I do think my explanation has pretty much covered your situation.

I do think you probably have too many shards per index. Cutting that down will also help increase the budget per active shard.
